Preface



Like any human language, C++ provides a way to express 
concepts. If successful, this medium of expression will be 
significantly easier and more flexible than the alternatives as 
problems grow larger and more complex. 

You can't just look at C++ as a collection of features; some of the features make no sense in
isolation. You can only use the sum of the parts if you are thinking about design, not simply
coding. And to understand C++ in this way, you must understand the problems with C and
with programming in general. This book discusses programming problems, why they are
problems, and the approach C++ has taken to solve such problems. Thus, the set of features I
explain in each chapter will be based on the way that I see a particular type of problem being
solved with the language. In this way I hope to move you, a little at a time, from
understanding C to the point where the C++ mindset becomes your native tongue.



Throughout, I'll be taking the attitude that you want to build a model in your head that allows
you to understand the language all the way down to the bare metal; if you encounter a puzzle
you'll be able to feed it to your model and deduce the answer. I will try to convey to you the
insights which have rearranged my brain to make me start "thinking in C++."



What's new in the second edition

to C++ or Java (leaving out the nasty bits that C programmers must deal with on a day-to-day
basis but that the C++ and Java languages steer you away from).

So the short answer is: what isn't brand new has been rewritten, sometimes to the point where 
you wouldn't recognize the original examples and material. 

What's in Volume 2 of this book

string and the Standard Template Library (STL) as well as new complexity in templates.
These and other more advanced topics have been relegated to Volume 2 of this book, 
including issues like multiple inheritance, exception handling, design patterns and topics 
about building stable systems and debugging them. 



How to get Volume 2 



Just like the book you currently hold, Thinking in C++, Volume 2 is freely downloadable
in its entirety from my web site at www.BruceEckel.com. The final version of Volume 2 will
be completed and printed in late 2000 or early 2001. 

The web site also contains the source code for both the books, along with updates and 
information about CD ROMs, public seminars, and in-house training, consulting, mentoring 
and walk-throughs. 



Prerequisites 
introduction to C, along with the Thinking in C seminar-on-CD, but still assuming that
you have some kind of programming experience already. In addition, just as you learn many
new words intuitively by seeing them in context in a novel, it's possible to learn a great deal 
about C from the context in which it is used in the rest of the book. 



Learning C++ 
and other "sophisticated" concepts, scuttling away in shame when the subjects came up in
conversation rather than reaching out for new knowledge. 

When I began my struggle to understand C++, the only decent book was Stroustrup's
self-professed "expert's guide,"[1] so I was left to simplify the basic concepts on my own. This
resulted in my first C++ book,[2] which was essentially a brain dump of my experience. That
was designed as a reader's guide, to bring programmers into C and C++ at the same time.
Both editions[3] of the book garnered an enthusiastic response.

At about the same time that Using C++ came out, I began teaching the language in live
seminars and presentations. Teaching C++ (and later, Java) became my profession; I've seen
nodding heads, blank faces, and puzzled expressions in audiences all over the world since
1989. As I began giving in-house training with smaller groups of people, I discovered
something during the exercises. Even those people who were smiling and nodding were
confused about many issues. I found out, by creating and chairing the C++ and Java tracks at
the Software Development Conference for many years, that I and other speakers tended to
give the typical audience too many topics, too fast. So eventually, through both variety in the
audience level and the way that I presented the material, I would end up losing some portion
of the audience. Maybe it's asking too much, but because I am one of those people resistant to
traditional lecturing (and for most people, I believe, such resistance results from boredom), I
wanted to try to keep everyone up to speed.

For a time, I was creating a number of different presentations in fairly short order. Thus, I
ended up learning by experiment and iteration (a technique that also works well in C++
program design). Eventually I developed a course using everything I had learned from my
teaching experience. It tackles the learning problem in discrete, easy-to-digest steps, and for a
hands-on seminar (the ideal learning situation), there are exercises following each of the
presentations.

The first edition of this book developed over the course of two years, and the material in this
book has been road-tested in many forms in many different seminars. The feedback that I've
gotten from each seminar has helped me change and refocus the material until I feel it works
well as a teaching medium. But it isn't just a seminar handout; I tried to pack as much
information as I could within these pages, and structure it to draw you through, onto the next
subject. More than anything, the book is designed to serve the solitary reader, struggling with
a new programming language.



[1] Bjarne Stroustrup, The C++ Programming Language, Addison-Wesley.

[2] Using C++, Osborne/McGraw-Hill 1989.

[3] Using C++ and C++ Inside & Out, Osborne/McGraw-Hill 1993.



Goals

My goals in this book are to:

1. Present the material a simple step at a time, so the reader can easily digest
each concept before moving on.

2. Use examples that are as simple and short as possible. This sometimes
prevents me from tackling "real-world" problems, but I've found that
beginners are usually happier when they can understand every detail of an
example rather than being impressed by the scope of the problem it solves.
Also, there's a severe limit to the amount of code that can be absorbed in a
classroom situation. For this I sometimes receive criticism for using "toy
examples," but I'm willing to accept that in favor of producing something
pedagogically useful.



3. Carefully sequence the presentation of features so that you aren't seeing
something you haven't been exposed to. Of course, this isn't always
possible; in those situations, a brief introductory description will be given.

4. Give you what I think is important for you to understand about the
language, rather than everything I know. I believe there is an "information
importance hierarchy," and there are some facts that 95% of programmers
will never need to know, and that would only confuse people and add to their
perception of the complexity of the language. To take an example from C, if
you memorize the operator precedence table (I never did) you can write
clever code. But if you have to think about it, it will confuse the
reader/maintainer of that code. So forget about precedence, and use
parentheses when things aren't clear. This same attitude will be taken with
some information in the C++ language, which I think is more important for
compiler writers than for programmers.

Keep 
documentation concerning their own implementation specifics is adequate.



Chapters 



(Because of this it is referred to as a hybrid object-oriented programming language.) As more
people have passed through the learning curve, we've begun to get a feel for the way 
programmers move through the stages of the C++ language features. Because it appears to be 
the natural progression of the procedurally-trained mind, I decided to understand and follow 
this same path, and accelerate the process by posing and answering the questions that came to 
me as I learned the language and that came from audiences as I taught it. 

This course was designed with one thing in mind: to streamline the process of learning the 
C++ language. Audience feedback helped me understand which parts were difficult and 
needed extra illumination. In the areas where I got ambitious and included too many features 
all at once, I came to know - through the process of presenting the material - that if you 
include a lot of new features, you have to explain them all, and the student's confusion is
easily compounded. As a result, I've taken a great deal of trouble to introduce the features as 
few at a time as possible; ideally, only one major concept at a time per chapter. 

The goal, then, is for each chapter to teach a single concept, or a small group of associated 
concepts, in such a way that no additional features are relied upon. That way you can digest 
each piece in the context of your current knowledge before moving on. To accomplish this, I
leave some C features in place for longer than I would prefer. The benefit is that you will not 
be confused by seeing all the C++ features used before they are explained, so your 
introduction to the language will be gentle and will mirror the way you will assimilate the 
features if left to your own devices. 

Here is a brief description of the chapters contained in this book: 

(5) Introduction to iostreams. One of the original C++ libraries - the one that provides the 
essential I/O facility - is called iostreams. Iostreams is intended to replace C's stdio.h with an 
I/O library that is easier to use, more flexible, and extensible - you can adapt it to work with 
your new classes. This chapter teaches you the ins and outs of how to make the best use of the 
existing iostream library for standard I/O, file I/O, and in-memory formatting. 

(15) Multiple inheritance. This sounds simple at first: A new class is inherited from more 
than one existing class. However, you can end up with ambiguities and multiple copies of 
base-class objects. That problem is solved with virtual base classes, but the bigger issue 
remains: When do you use it? Multiple inheritance is only essential when you need to 
manipulate an object through more than one common base class. This chapter explains the 
syntax for multiple inheritance, and shows alternative approaches - in particular, how 
templates solve one common problem. The use of multiple inheritance to repair a "damaged" 
class interface is demonstrated as a genuinely valuable use of this feature. 



(16) Exception handling. Error handling has always been a problem in programming. Even if
you dutifully return error information or set a flag, the function caller may simply ignore it.
Exception handling is a primary feature in C++ that solves this problem by allowing you to
"throw" an object out of your function when a critical error happens. You throw different
types of objects for different errors, and the function caller "catches" these objects in separate
error handling routines. If you throw an exception, it cannot be ignored, so you can guarantee
that something will happen in response to your error.

(17) Run-time type identification. Run-time type identification (RTTI) lets you find the
exact type of an object when you only have a pointer or reference to the base type. Normally,
you'll want to intentionally ignore the exact type of an object and let the virtual function
mechanism implement the correct behavior for that type. But occasionally it is very helpful to
know the exact type of an object for which you only have a base pointer; often this
information allows you to perform a special-case operation more efficiently. This chapter
explains what RTTI is for and how to use it.
Exercise solutions

Solutions to the exercises can be found in the electronic document The C++ Annotated Solution
Guide, Volume 2 by Chuck Allison, available for a small fee from www.BruceEckel.com.
[[ Note this is not yet available ]]

Source code 
site. The program will create a directory for each chapter and unpack the code into those
directories. In the starting directory where you unpacked the code you will find the following
copyright notice:

//:! :CopyRight.txt
Copyright (c) Bruce Eckel, 1999
Source code file from the book "Thinking in C++"
All rights reserved EXCEPT as allowed by the
following statements: You can freely use this file
for your own work (personal or commercial),
including modifications and distribution in
executable form only. Permission is granted to use
this file in classr.
"Thinking in C++" ii
Except in classroom
and distribute this
distribution point
(and official mirror sites) when
freely available. You cannot
copyright and notice. You cannot distribute
modified versions of the source code in this
package. You cannot use this file in printed
media without the express permission of the
author. Bruce Eckel makes no representation about
the suitability of this software for any purpose.
It is provided "as is" without express or implied
warranty of any kind, including any implied
warranty of merchantability, fitness for a
particular purpose or non-infringement. The entire
risk as to the quality and performance of the
software is with you. Bruce Eckel and the
publisher shall not be liable for any damages
suffered by you or any third party as a result of
using or distributing software. In no event will
Bruce Eckel or the publisher be liable for any
lost revenue, profit, or data, or for direct,
indirect, special, consequential,
punitive damages, however caused i
the theory of liability, arising i
or inability to use software, evei
and the publisher have been adviss
possibility of such damages. Shou.
prove defective, you assume the ci
necessary servicing, repair, or correction. If you
think you've found an error, please submit the
correction using the form you will find at
www.BruceEckel.com. (Please use the same
form for non-code errors found in the book.)
///:~
You may use the code in your projects and in the classroom as long as the copyright notice is
retained.



Language standards 
pre-Standard versions of C will I make the distinction.

At this writing the ANSI/ISO C++ committee was finished working on the language. Thus, I
will use the term Standard C++ to refer to the standardized language. If I simply refer to C++
you should assume I mean "Standard C++."
walkthroughs. Information and sign-up forms for upcoming seminars and other contact
information can be found at http://www.BruceEckel.com.
Part 1: The Standard C++ Library

Standard C++ not only incorporates all the Standard C 
libraries, with small additions and changes to support type 
safety, it also adds libraries of its own. These libraries are far 
more powerful than those in Standard C; the leverage you 
get from them is analogous to the leverage you get from 
changing from C to C++. 

This section of the book gives you an in-depth introduction to the most important portions of 
the Standard C++ library. 

The most complete and also the most obscure reference to the full libraries is the Standard
itself. Somewhat more readable (and yet still a self-described "expert's guide") is Bjarne
Stroustrup's 3rd Edition of The C++ Programming Language (Addison-Wesley, 1997).
Another valuable reference is the 3rd edition of C++ Primer, by Lippman & Lajoie. The goal
of the chapters in this book that cover the libraries is to provide you with an encyclopedia of
descriptions and examples so you'll have a good starting point for solving any problem that
requires the use of the Standard libraries. However, there are some techniques and topics that
are used rarely enough that they are not covered here, so if you can't find it in these chapters
you should reach for the other two books; this book is not intended to replace those but rather
to complement them. In particular, I hope that after going through the material in the
following chapters you'll have a much easier time understanding those books.

You will notice that this section does not contain exhaustive documentation describing every
function and class in the Standard C++ library. I've left the full descriptions to others; in
particular, there is an especially good on-line source of standard library documentation in
HTML format that you can keep resident on your computer and view with a Web browser
whenever you need to look something up. This is PJ Plauger's Dinkumware C/C++ Library
reference at http://www.dinkumware.com. You can view this on-line, and purchase it for local
viewing. It contains complete reference pages for both the C and C++ libraries (so it's
good to use for all your Standard C/C++ programming questions). I am particularly fond of
electronic documentation not only because you can always have it with you, but also because
you can do an electronic search for what you're seeking.

When you're actively programming, these resources should adequately satisfy your reference 
needs (and you can use them to look up anything in this chapter that isn't clear to you). 
Appendix XX lists additional references. 



Library overview 
The first chapter in this section introduces the Standard C++ string class, which is a powerful
tool that simplifies most of the text processing chores you might have to do. The string class
may be the most thorough string manipulation tool you've ever seen. Chances are, anything
you've done to character strings with lines of code in C can be done with a member function
call in the string class, including append( ), assign( ), insert( ), erase( ), replace( ),
resize( ), copy( ), find( ), rfind( ), find_first_of( ), find_last_of( ), find_first_not_of( ),
find_last_not_of( ), substr( ), and compare( ). The operators =, +=, and [ ] are also
overloaded to perform the intuitive operations. In addition, there's a "wide" wstring class
designed to support international character sets. Both string and wstring (declared in
<string>, not to be confused with C's <string.h>, which is, in strict C++, <cstring>) are
created from a common template class called basic_string. Note that the string classes are
seamlessly integrated with iostreams, virtually eliminating the need for you to ever use
strstream.

The next chapter covers the iostream library. 

Language Support. Elements inherent to the language itself, like implementation limits in
<climits> and <cfloat>; dynamic memory declarations in <new> like bad_alloc (the
exception thrown when you're out of memory) and set_new_handler; the <typeinfo> header
for RTTI and the <exception> header that declares the terminate( ) and unexpected( )
functions.

Diagnostics Library. Components C++ programs can use to detect and report errors. The
<exception> header declares the standard exception classes and <cassert> declares the same
thing as C's assert.h.

General Utilities Library. These components are used by other parts of the Standard C++
library, but you can also use them in your own programs. Included are templatized versions of
operators !=, >, <=, and >= (to prevent redundant definitions), a pair template class with a
tuple-making template function, a set of function objects for support of the STL, and storage
allocation functions for use with the STL so you can easily modify the storage allocation
mechanism.
Localization Library. This allows you to localize strings in your program to adapt to usage
in different countries, including money, numbers, date, time, and so on.

Containers Library. This includes the Standard Template Library (described in the next
section of this appendix) and also the bits and bit_string classes in <bits> and <bitstring>,
respectively. Both bits and bit_string are more complete implementations of the bitvector
concept introduced in Chapter XX. The bits template creates a fixed-sized array of bits that
can be manipulated with all the bitwise operators, as well as member functions like set( ),
reset( ), count( ), length( ), test( ), any( ), and none( ). There are also conversion operators
to_ushort( ), to_ulong( ), and to_string( ).

The bit_string class is, by contrast, a dynamically sized array of bits, with similar operations
to bits, but also with additional operations that make it act somewhat like a string. There's a
fundamental difference in bit weighting: With bits, the right-most bit (bit zero) is the least
significant bit, but with bit_string, the right-most bit is the most significant bit. There are no
conversions between bits and bit_string. You'll use bits for a space-efficient set of on-off
flags and bit_string for manipulating arrays of binary values (like pixels).

Iterators Library. Includes iterators that are tools for the STL (described in the next section 
of this appendix), streams, and stream buffers. 

Algorithms Library. These are the template functions that perform operations on the STL
containers using iterators. The algorithms include: adjacent_find, prev_permutation,
binary_search, push_heap, copy, random_shuffle, copy_backward, remove, count,
remove_copy, count_if, remove_copy_if, equal, remove_if, equal_range, replace, fill,
replace_copy, fill_n, replace_copy_if, find, replace_if, find_if, reverse, for_each,
reverse_copy, generate, rotate, generate_n, rotate_copy, includes, search,
inplace_merge, set_difference, lexicographical_compare, set_intersection, lower_bound,
set_symmetric_difference, make_heap, set_union, max, sort, max_element, sort_heap,
merge, stable_partition, min, stable_sort, min_element, swap, mismatch, swap_ranges,
next_permutation, transform, nth_element, unique, partial_sort, unique_copy,
partial_sort_copy, upper_bound, and partition.



Numerics Library. The goal of this library is to allow the compiler implementer to take 
advantage of the architecture of the underlying machine when used for numerical operations. 
This way, creators of higher level numerical libraries can write to the numerics library and 
produce efficient algorithms without having to customize to every possible machine. The 
numerics library also includes the complex number class (which appeared in the first version 
of C++ as an example, and has become an expected part of the library) in float, double, and 
long double forms. 
1: Strings

One of the biggest time-wasters in C is character arrays:
keeping track of the difference between static quoted strings 
and arrays created on the stack and the heap, and the fact 
that sometimes you're passing around a char* and 
sometimes you must copy the whole array. 

(This is the general problem of shallow copy vs. deep copy.) Especially because string 
manipulation is so common, character arrays are a great source of misunderstandings and 
bugs. 

Despite this, creating string classes remained a common exercise for beginning C++
programmers for many years. The Standard C++ library string class solves the problem of 
character array manipulation once and for all, keeping track of memory even during 
assignments and copy-constructions. You simply don't need to think about it. 

This chapter examines the Standard C++ string class, beginning with a look at what
constitutes a C++ string and how the C++ version differs from a traditional C character array.
You'll learn about operations and manipulations using string objects, and see how C++
strings accommodate variation in character sets and string data conversion.

Handling text is perhaps one of the oldest of all programming applications, so it's not
surprising that the C++ string draws heavily on the ideas and terminology that have long been
used for this purpose in C and other languages. As you begin to acquaint yourself with C++
strings this fact should be reassuring, in the respect that no matter what programming idiom
you choose, there are really only about three things you can do with a string: create or modify
the sequence of characters stored in the string, detect the presence or absence of elements
within the string, and translate between various schemes for representing string characters.

You'll see how each of these jobs is accomplished using C++ string objects. 



What's in a string 



(Much of the material in this chapter was originally created by Nancy Nicolaisen.)

There are two significant differences between C++ strings and their C progenitors. First, C++
string objects associate the array of
characters which constitute the string with methods useful for managing and operating on it.
A string also contains certain "housekeeping" information about the size and storage location
of its data. Specifically, a C++ string object knows its starting location in memory, its
content, its length in characters, and the length in characters to which it can grow before the
string object must resize its internal data buffer. This gives rise to the second big difference
between C char arrays and C++ strings. C++ strings do not include a null terminator, nor do
the C++ string handling member functions rely on the existence of a null terminator to
perform their jobs. C++ strings greatly reduce the likelihood of making three of the most
common and destructive C programming errors: overwriting array bounds, trying to access
arrays through uninitialized or incorrectly valued pointers, and leaving pointers "dangling"
after an array ceases to occupy the storage that was once allocated to it.

The exact implementation of memory layout for the string class is not defined by the C++
Standard. This architecture is intended to be flexible enough to allow differing
implementations by compiler vendors, yet guarantee predictable behavior for users. In
particular, the exact conditions under which storage is allocated to hold data for a string object
are not defined. String allocation rules were formulated to allow but not require a
reference-counted implementation, but whether or not the implementation uses reference counting, the
semantics must be the same. To put this a bit differently, in C, every char array occupies a
unique physical region of memory. In C++, individual string objects may or may not occupy
unique physical regions of memory, but if reference counting is used to avoid storing
duplicate copies of data, the individual objects must look and act as though they do
exclusively own unique regions of storage. For example:

//: C01:StringStorage.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s1("12345");
  // Set the iterator to indicate the first element
  string::iterator it = s1.begin();
  // This may copy the first to the second or
  // use reference counting to simulate a copy
  string s2 = s1;
  // Either way, this statement may ONLY modify first
  *it = '0';
  cout << "s1 = " << s1 << endl;
  cout << "s2 = " << s2 << endl;
} ///:~






Reference counting may serve to make an implementation more memory efficient, but it is 
transparent to users of the string class. 
Creating and initializing C++ strings

Creating and initializing strings is a straightforward proposition, and fairly flexible as well. In
the example shown below, the first string, imBlank, is declared but contains no initial value.
Unlike a C char array, which would contain a random and meaningless bit pattern until
initialization, imBlank does contain meaningful information. This string object has been
initialized to hold "no characters," and can properly report its 0 length and absence of data
elements through the use of class member functions.

The next string, heyMom, is initialized by the literal argument "Where are my socks?". This
form of initialization uses a quoted character array as a parameter to the string constructor.
By contrast, standardReply is simply initialized with an assignment. The last string of the
group, useThisOneAgain, is initialized using an existing C++ string object. Put another way,
this example illustrates that string objects let you:

• Create an empty string and defer initializing it with character data

• Initialize a string by passing a literal, quoted character array as an argument to the
constructor

• Initialize a string using '='

• Use one string to initialize another

//: C01:SmallString.cpp
#include <string>
using namespace std;

int main() {
  string imBlank;
  string heyMom("Where are my socks?");
  string standardReply = "Beamed into deep ";
  string useThisOneAgain(standardReply);
} ///:~

These are the simplest forms of string initialization, but there are other variations which offer
more flexibility and control. You can:

• Use a portion of either a C char array or a C++ string

• Combine different sources of initialization data using operator+

• Use the string object's substr( ) member function to create a substring

//: C01:SmallString2.cpp
#include <string>
#include <iostream>
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t << quoteMe << endl ; 
The string member function substr( ) takes a starting position as its first argument and the
number of characters to select as the second argument. Both of these arguments have default 
values and if you say substr( ) with an empty argument list you produce a copy of the entire 
string, so this is a convenient way to duplicate a string. 



Here's what the string quoteMe contains after the initialization shown above:

: in a UFO.? 



Notice the final line of the example above. C++ allows string initialization techniques to be
mixed in a single statement, a flexible and convenient feature. Also note that the last
initializer copies just one character from the source string.

Another slightly more subtle initialization technique involves the use of the string iterators
string.begin( ) and string.end( ). This treats a string like a container object (which you've
seen primarily in the form of vector so far in this book; you'll see many more containers
soon) which has iterators indicating the start and end of the "container." This way you can
hand a string constructor two iterators and it will copy from one to the other into the new
string:



//: C01:St 



s.cpp 
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^include 
#include 
using nai 



ainl) ! 
ing sour 



The iterators are not restricted to begin( ) and end( ), so you can choose a subset of characters
from the source string.

Initialization limitations 

C++ strings may not be initialized with single characters or with ASCII or other integer
values.

//: C01:UhOh.cpp
#include <string>
using namespace std;

int main() {
  // Error: no single char inits
  //! string nothingDoing1('a');
  // Error: no integer inits
  //! string nothingDoing2(0x37);
} ///:~



This is true both for initialization by assignment and by copy constructor.

Operating on strings 



If you've programmed in C, you are accustomed to a family of functions for writing, searching, rearranging, and copying char arrays. However, there are
two unfortunate aspects of the Standard C library functions for handling char arrays. First, 
there are three loosely organized families of them: the "plain" group, the group that 
manipulates the characters without respect to case, and the ones which require you to supply a 
count of the number of characters to be considered in the operation at hand. The roster of 
function names in the C char array handling library literally runs to several pages, and though 
the kind and number of arguments to the functions are somewhat consistent within each of the 
three groups, to use them properly you must be very attentive to details of function naming 
and parameter passing. 

The second inherent trap of the standard C char array tools is that they all rely explicitly on
the assumption that the character array includes a null terminator. If by oversight or error the
null is omitted or overwritten, there's very little to keep the C char array handling functions
from manipulating the memory beyond the limits of the allocated space, sometimes with
disastrous results.



C++ provides a vast improvement in the convenience and safety of string objects. For 
purposes of actual string handling operations, there are a modest two or three dozen member
function names. It's worth your while to become acquainted with these. Each function is 
overloaded, so you don't have to learn a new string member function name simply because of 
small differences in their parameters. 

Appending, inserting and concatenating 
strings 

One of the most valuable and convenient aspects of C++ strings is that they grow as needed,
without intervention on the part of the programmer. Not only does this make string handling
code inherently more trustworthy, it also almost entirely eliminates a tedious "housekeeping"
chore - keeping track of the bounds of the storage in which your strings live. For example, if
you create a string object and initialize it with a string of 50 copies of 'X', and later store in it
50 copies of "Zowie", the object itself will reallocate sufficient storage to accommodate the
growth of the data. Perhaps nowhere is this property more appreciated than when the strings
in your code change in size.

Appending, concatenating, and inserting strings often give rise to this circumstance, but the
string member functions append( ) and insert( ) transparently reallocate storage when a string
grows.

//: C01:StrSize.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string bigNews("I saw Elvis in a UFO. ");
  cout << bigNews << endl;
  // How much data have we actually got?
  cout << "Size = " << bigNews.size() << endl;
  // How much can we store without reallocating
  cout << "Capacity = "
    << bigNews.capacity() << endl;
  // Insert this string in bigNews immediately
  // before bigNews[1]
  bigNews.insert(1, " thought I");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = "
    << bigNews.capacity() << endl;
  // Make sure that there will be this much space
  bigNews.reserve(500);
  // Add this to the end of the string
  bigNews.append("I've been working too hard.");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = "
    << bigNews.capacity() << endl;
} ///:~
Here is the output: 

saw Elvis in a UFO. 



Capacity = 511 

This example demonstrates that even though you can safely relinquish much of the 
responsibility for allocating and managing the memory your strings occupy, C++ strings 
provide you with several tools to monitor and manage their size. The size( ), resize( ),
capacity( ), and reserve( ) member functions can be very useful when it's necessary to work
back and forth between data contained in C++ style strings and traditional null terminated C
char arrays. Note the ease with which we changed the size of the storage allocated to the
string.

The exact fashion in which the string member functions will allocate space for your data is
dependent on the implementation of the library. When one implementation was tested with 
the example above, it appeared that reallocations occurred on even word boundaries, with one 
byte held back. The architects of the string class have endeavored to make it possible to mix 
the use of C char arrays and C++ string objects, so it is likely that figures reported by 
StrSize.cpp for capacity reflect that in this particular implementation, a byte is set aside to 
easily accommodate the insertion of a null terminator.

Replacing string characters



insert( ) is particularly nice because it absolves you of making sure the insertion of characters
in a string won't overrun the storage space or overwrite the characters immediately following
the insertion point. Space grows and existing characters politely move over to accommodate
the new elements. Sometimes, however, this might not be what you want to happen. If the 
data in string needs to retain the ordering of the original characters relative to one another or 
must be a specific constant size, use the replace( ) function to overwrite a particular sequence 
of characters with another group of characters. There are quite a number of overloaded 
versions of replace( ), but the simplest one takes three arguments: an integer telling where to 
start in the string, an integer telling how many characters to eliminate from the original string, 
and the replacement string (which can be a different number of characters than the eliminated 
quantity). Here's a very simple example: 

//: C01:StringReplace.cpp
// Simple find-and-replace in strings
#include <string>
#include <iostream>
using namespace std;

  string s("A piece of
  string tag("$tag$");



The tag is first inserted into s (notice that the insert happens before the value indicating the
insert point, and that an extra space was added after tag), then it is found and replaced.

You should actually check to see if you've found anything before you perform a replace( ).
The above example replaces with a char*, but there's an overloaded version that replaces
with a string. Here's a more complete demonstration of replace( ):

//: C01:Replace.cpp
#include <string>
#include <iostream>
using namespace std;

void replaceChars(string& modifyMe,
  string findMe, string newChars) {
  // Look in modifyMe for the "find string"
  // starting at position 0
  int i = modifyMe.find(findMe, 0);
  // Did we find the string to replace?
  if(i != string::npos)
    // Replace the find string with newChars
    modifyMe.replace(i, findMe.size(), newChars);
}

int main() {
  string bigNews =
    "I thought I saw Elvis in a UFO. "
    "I have been working too hard.";
  string replacement("wig");
  string findMe("UFO");
  // Find "UFO" in bigNews and overwrite it:
  replaceChars(bigNews, findMe, replacement);
  cout << bigNews << endl;
} ///:~
Now the last line of output from replace.cpp looks like this: 

I thought I saw Elvis in a wig. I have been
working too hard.

If find( ) doesn't find the search string, it returns npos. npos is a static constant member of
the basic_string class.

Unlike insert( ), replace( ) won't grow the string's storage space if you copy new characters
into the middle of an existing series of array elements. However, it will grow the storage 
space if you make a "replacement" that writes beyond the end of an existing array. Here's an 
example: 

//: C01:ReplaceAndGrow.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string bigNews("I saw Elvis in a UFO. "
    "I have been working too hard.");
  string replacement("wig");
  // The first arg says "replace chars
  // beyond the end of the existing string":
  bigNews.replace(bigNews.size(),
    replacement.size(), replacement);
  cout << bigNews << endl;
} ///:~

The call to replace( ) begins "replacing" beyond the end of the existing array. The output
looks like this:

I saw Elvis in a UFO. I have
been working too hard. wig 

Notice that replace( ) expands the array to accommodate the growth of the string due to
"replacement" beyond the bounds of the existing array. 

Simple character replacement using the STL 
replace( ) algorithm 

You may have been hunting through this chapter trying to do something relatively simple like
replace all the instances of one character with a different character. Upon finding the above
section on replacing, you thought you found the answer but then you started seeing groups of
characters and counts and other things that looked a bit too complex. Doesn't string have a
way to just replace one character with another everywhere?

The string class by itself doesn't solve all possible problems. The remainder are relegated to
the STL algorithms, because the string class can look just like an STL container (the STL
algorithms work with anything that looks like an STL container). All the STL algorithms
work on a "range" of elements within a container. Usually that range is just "from the
beginning of the container to the end." A string object looks like a container of characters: to
get the beginning of the range you use string::begin( ) and to get the end of the range you use
string::end( ). The following example shows the use of the STL replace( ) algorithm to
replace all the instances of 'X' with 'Y':

//: C01:StringCharReplace.cpp
#include <string>
#include <algorithm>
#include <iostream>
using namespace std;

int main() {
  string s("aaaXaaaXXaaXXXaXXXXaaa");
  cout << s << endl;
  replace(s.begin(), s.end(), 'X', 'Y');
  cout << s << endl;
} ///:~

Notice that this replace( ) is not called as a member function of string. Also, unlike the
string::replace( ) functions which only perform one replacement, the STL replace is
replacing all instances of one character with another.

The STL replace( ) algorithm only works with single objects (in this case, char objects), and
will not perform replacements of quoted char arrays or of string objects.



Since a string looks like an STL container, there are a number of other STL algorithms that
can be applied to it, which may solve other problems you have that are not directly addressed
by the string member functions. See Chapter XX for more information on the STL
algorithms.

Concatenation using non-member 
overloaded operators 

One of the most delightful discoveries awaiting a C programmer learning about C++ string
handling is how simply strings can be combined and appended using operator+ and
operator+=. These operators make combining strings syntactically equivalent to adding
numeric data.
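//: C01:AddStrings.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s1("This ");
  string s2("That ");
  string s3("The other ");
  // operator+ concatenates strings
  s1 = s1 + s2;
  cout << s1 << endl;
  // Another way to concatenate strings
  s1 += s3;
  cout << s1 << endl;
  // You can index the string on the right
  s1 += s3 + s3[4] + "oh lala";
  cout << s1 << endl;
} ///:~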
The output looks like this: 

operator+ and operator+= are a very flexible and convenient means of combining string
data. On the right hand side of the statement, you can use almost any type that evaluates to a
group of one or more characters.

Searching in strings 

The find family of string member functions allows you to locate a character or group of
characters within a given string. Here are the members of the find family and their general
uses:


string find member function   What/how it finds

find( )                       Searches a string for a specified character or
                              group of characters and returns the starting
                              position of the first occurrence found or npos
                              if no match is found. (npos is a const of -1
                              and indicates that a search failed.)

find_first_of( )              Searches a target string and returns the
                              position of the first match of any character in
                              a specified group. If no match is found, it
                              returns npos.

find_last_of( )               Searches a target string and returns the
                              position of the last match of any character in
                              a specified group. If no match is found, it
                              returns npos.

find_first_not_of( )          Searches a target string and returns the
                              position of the first element that doesn't
                              match any character in a specified group. If
                              no such element is found, it returns npos.

find_last_not_of( )           Searches a target string and returns the
                              position of the element with the largest
                              subscript that doesn't match any character
                              in a specified group. If no such element is
                              found, it returns npos.

rfind( )                      Searches a string from end to beginning for a
                              specified character or group of characters and
                              returns the starting position of the match if
                              one is found. If no match is found, it returns
                              npos.

String searching member functions and their general uses

The simplest use of find( ) searches for one or more characters in a string. This overloaded
version of find( ) takes a parameter that specifies the character(s) for which to search, and
optionally one that tells it where in the string to begin searching for the occurrence of a
substring. (The default position at which to begin searching is 0.) By setting the call to find
inside a loop, you can easily move through a string, repeating a search in order to find all of
the occurrences of a given character or group of characters within the string.

Notice that we define the string object sieveChars using a constructor idiom which sets the
initial size of the character array and writes the value 'P' to each of its members.

//: C01:Sieve.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  // Create a 50 char string and set each
  // element to 'P' for Prime
  string sieveChars(50, 'P');
  // By definition neither 0 nor 1 is prime.
  // Change these elements to "N" for Not Prime
  sieveChars.replace(0, 2, "NN");
  // Walk through the array:
  for(int i = 2;
    i <= (sieveChars.size() / 2) - 1; i++)
    // Find all the factors:
    for(int factor = 2;
      factor * i < sieveChars.size(); factor++)
      sieveChars[factor * i] = 'N';

  cout << "Prime:" << endl;
  // Return the index of the first 'P' element:
  int j = sieveChars.find('P');
  // While not at the end of the string:
  while(j != sieveChars.npos) {
    // If the element is P, the index is a prime
    cout << j << " ";
    // Move past the last prime
    j++;
    // Find the next prime
    j = sieveChars.find('P', j);
  }
  cout << "\n Not prime:" << endl;
  // Find the first element value not equal to 'P':
  j = sieveChars.find_first_not_of('P');
  while(j != sieveChars.npos) {
    cout << j << " ";
    j++;
    j = sieveChars.find_first_not_of('P', j);
  }
} ///:~

The output from Sieve.cpp looks like this: 

Prime: 

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 

Not prime: 

0 1 4 6 8 9 10 12 14 15 16 18 20 21 22

24 25 26 27 28 30 32 33 34 35 36 38 39 

40 42 44 45 46 48 49 

find( ) allows you to walk forward through a string, detecting multiple occurrences of a
character or group of characters, while find_first_not_of( ) allows you to test for the absence
of a character or group.

The find member is also useful for detecting the occurrence of a sequence of characters in a
string:

//: C01:Find.cpp
// Find a group of characters in a string
#include <string>
#include <iostream>
using namespace std;

int main() {
  string chooseOne("Eenie, meenie, miney, mo");
  int i = chooseOne.find("een");
  while(i != string::npos) {
    cout << i << endl;
    i++;
    i = chooseOne.find("een", i);
  }
} ///:~

Find.cpp produces a single line of output:

8

This tells us that the first 'e' of the search group "een" was found in the word "meenie," and 
is the eighth element in the string. Notice that find passed over the "Een" group of characters 
in the word "Eenie". The find member function performs a case sensitive search.

There are no functions in the string class to change the case of a string, but these functions
can be easily created using the Standard C library functions toupper( ) and tolower( ), which
change the case of one character at a time. A few small changes will make Find.cpp perform
a case insensitive search:



//: C01:NewFind.cpp
#include <string>
#include <iostream>
#include <cctype>
using namespace std;

// Make an uppercase copy of s:
string upperCase(const string& s) {
  char* buf = new char[s.length()];
  s.copy(buf, s.length());
  for(int i = 0; i < s.length(); i++)
    buf[i] = toupper(buf[i]);
  string r(buf, s.length());
  delete [] buf;
  return r;
}

// Make a lowercase copy of s:
string lowerCase(const string& s) {
  char* buf = new char[s.length()];
  s.copy(buf, s.length());
  for(int i = 0; i < s.length(); i++)
    buf[i] = tolower(buf[i]);
  string r(buf, s.length());
  delete [] buf;
  return r;
}

int main() {
  string chooseOne("Eenie, meenie, miney, mo");
  cout << chooseOne << endl;
  cout << upperCase(chooseOne) << endl;
  cout << lowerCase(chooseOne) << endl;
  // Case sensitive search
  int i = chooseOne.find("een");
  while(i != string::npos) {
    cout << i << endl;
    i++;
    i = chooseOne.find("een", i);
  }
  // Search lowercase:
  string lcase = lowerCase(chooseOne);
  cout << lcase << endl;
  i = lcase.find("een");
  while(i != lcase.npos) {
    cout << i << endl;
    i++;
    i = lcase.find("een", i);
  }
  // Search uppercase:
  string ucase = upperCase(chooseOne);
  cout << ucase << endl;
  i = ucase.find("EEN");
  while(i != ucase.npos) {
    cout << i << endl;
    i++;
    i = ucase.find("EEN", i);
  }
} ///:~



Both the upperCase( ) and lowerCase( ) functions follow the same form: they allocate
storage to hold the data in the argument string, copy the data and change the case. Then they 
create a new string with the new data, release the buffer and return the result string. The 
c_str( ) function cannot be used to produce a pointer to directly manipulate the data in the 
string because c_str( ) returns a pointer to const. That is, you're not allowed to manipulate 
string data with a pointer, only with member functions. If you need to use the more primitive 
char array manipulation, you should use the technique shown above. 

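The output looks like this:

Eenie, meenie, miney, mo
EENIE, MEENIE, MINEY, MO
eenie, meenie, miney, mo
8
eenie, meenie, miney, mo
0
8
EENIE, MEENIE, MINEY, MO
0
8

The case insensitive searches found both occurrences of the "een" group. NewFind.cpp isn't
the best solution to the case sensitivity problem, so we'll revisit it when we examine string
comparisons.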



Chapter 14: Templates & Conta 



Finding in reverse 



Sometimes it's necessary to search through a string from end to beginning, if you need to find
the data in "last in / first out" order. The string member function rfind( ) handles this job.



//: C01:Rparse.cpp
// Reverse the order of words in a string
#include <string>
#include <iostream>
#include <vector>
using namespace std;

int main() {
  // The ';' characters will be delimiters
  string s("now.;sense;make;to;going;is;This");
  cout << s << endl;
  // To store the words:
  vector<string> strings;
  // The last element of the string:
  int last = s.size();
  // The beginning of the current word:
  int current = s.rfind(';');
  // Walk backward through the string:
  while(current != string::npos) {
    // Push each word into the vector.
    // current is incremented before copying to
    // avoid copying the delimiter.
    ++current;
    strings.push_back(s.substr(current, last - current));
    // Back over the delimiter we just found,
    // and set last to the end of the next word
    current -= 2;
    last = current + 1;
    if(current < 0) break; // Delimiter was at index 0
    // Find the next delimiter
    current = s.rfind(';', current);
  }
  // Pick up the first word - it's not
  // preceded by a delimiter
  strings.push_back(s.substr(0, last));
  // Print them in the new order:
  for(int j = 0; j < strings.size(); j++)
    cout << strings[j] << " ";
} ///:~

Here's how the output from Rparse.cpp looks:

now.;sense;make;to;going;is;This
This is going to make sense now.



rfind( ) backs through the string looking for tokens, reporting the array index of matching
characters or string::npos if it is unsuccessful.



Finding first/last of a set 



The find_first_not_of( ) and find_last_not_of( ) member functions can be conveniently put to
work to create a little utility that will strip whitespace characters off of both ends of a string.
Notice it doesn't touch the original string, but instead returns a new string:



//: C01:trim.h
#ifndef TRIM_H
#define TRIM_H
#include <string>
// General tool to strip spaces from both ends:
inline std::string trim(const std::string& s) {
  if(s.length() == 0)
    return s;
  int b = s.find_first_not_of(" \t");
  int e = s.find_last_not_of(" \t");
  if(b == -1) // No non-spaces
    return "";
  return std::string(s, b, e - b + 1);
}
#endif // TRIM_H ///:~

The first test checks for an empty string; in that case no tests are made and a copy is returned. 
Notice that once the end points are found, the string constructor is used to build a new string 
from the old one, giving the starting count and the length. This form also utilizes the "return 
value optimization" (see the index for more details). 

Testing such a general-purpose tool needs to be thorough: 

//: C01:TrimTest.cpp
#include "trim.h"
#include <iostream>
using namespace std;

string s[] = {
  " \t abcdefghijklmnop \t ",
  "abcdefghijklmnop \t ",
  " \t abcdefghijklmnop",
  "a", "ab", "abc", "a b c",
  " \t a b c \t ", " \t a \t b \t c \t ",
  "", // Must also test the empty string
};

void test(string s) {
  cout << "[" << trim(s) << "]" << endl;
}

int main() {
  for(int i = 0; i < sizeof s / sizeof *s; i++)
    test(s[i]);
} ///:~

In the array of string s, you can see that the character arrays are automatically converted to 
string objects. This array provides cases to check the removal of spaces and tabs from both 
ends, as well as ensuring that spaces and tabs do not get removed from the middle of a string. 

Removing characters from strings 

My word processor/page layout program (Microsoft Word) will save a document in HTML,
but it doesn't recognize that the code listings in this book should be tagged with the HTML
"preformatted" tag (<PRE>), and it puts paragraph marks (<P> and </P>) around every listing
line. This means that all the indentation in the code listings is lost. In addition, Word saves
HTML with reduced font sizes for body text, which makes it hard to read.

To convert the book to HTML form, then, the original output must be reprocessed, watching
for the tags that mark the start and end of code listings, inserting the <PRE> and </PRE> tags
at the appropriate places, removing all the <P> and </P> tags within the listings, and adjusting
the font sizes. Removal is accomplished with the erase( ) member function, but you must
correctly determine the starting and ending points of the substring you wish to erase. Here's
the program that reprocesses the generated HTML file:

(I subsequently found better tools to accomplish this task, but the program is still interesting.)

//: C01:ReprocessHTML.cpp
// Take Word's html output and fix up
// the code listings and html tags
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

// Produce a new string which is the original
// string with the html paragraph break marks
// stripped off:
string stripPBreaks(string s) {
  int br;
  while((br = s.find("<P>")) != string::npos)
    s.erase(br, 3); // Length of "<P>"
  while((br = s.find("</P>")) != string::npos)
    s.erase(br, 4); // Length of "</P>"
  return s;
}

// After the beginning of a code listing is
// detected, this function cleans up the listing
// until the end marker is found. The first line
// of the listing is passed in by the caller,
// which detects the start marker in the line.
void fixupCodeListing(istream& in,
  ostream& out, string& line, int tag) {
  out << line.substr(0, tag)
    << "<PRE>" // Means "preformatted" in html
    << stripPBreaks(line.substr(tag)) << endl;
  string s;
  while(getline(in, s)) {
    int endtag = s.find("/""/""/"":~");
    if(endtag != string::npos) {
      string before = s.substr(0, endtag);
      string after = s.substr(endtag);
      out << stripPBreaks(before) << "</PRE>"
        << after << endl;
      return;
    }
    out << stripPBreaks(s) << endl;
  }
}

string removals[] = {
  "<FONT SIZE=2>",
  "<FONT SIZE=1>",
  "<FONT FACE=\"Times\" SIZE=1>",
  "<FONT FACE=\"Times\" SIZE=2>",
  "<FONT FACE=\"Courier\" SIZE=1>",
  "SIZE=1", // Eliminate all other size settings
  "SIZE=2",
};
const int rmsz =
  sizeof(removals)/sizeof(*removals);

int main(int argc, char* argv[]) {
  requireArgs(argc, 2);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  ofstream out(argv[2]);
  string line;
  while(getline(in, line)) {
    // The "Body" tag only appears once:
    if(line.find("<BODY") != string::npos) {
      out << "<BODY BGCOLOR=\"#FFFFFF\" "
        "TEXT=\"#000000\">" << endl;
      continue; // Get next line
    }
    // Eliminate each of the removals strings:
    for(int i = 0; i < rmsz; i++) {
      int find = line.find(removals[i]);
      if(find != string::npos)
        line.erase(find, removals[i].size());
    }
    int tag1 = line.find("/""/"":");
    int tag2 = line.find("/""*"":");
    if(tag1 != string::npos)
      fixupCodeListing(in, out, line, tag1);
    else if(tag2 != string::npos)
      fixupCodeListing(in, out, line, tag2);
    else
      out << line << endl;
  }
} ///:~

Notice the lines that detect the start and end listing tags by indicating them with each 
character in quotes. These tags are treated in a special way by the logic in the 
Extractcode.cpp tool for extracting code listings. To present the code for the tool in the text 
of the book, the tag sequence itself must not occur in the listing. This was accomplished by 
taking advantage of a C++ preprocessor feature that causes text strings delimited by adjacent 
pairs of double quotes to be merged into a single string during the preprocessor pass of the 
build.

  int tag1 = line.find("/""/"":");

The effect of the sequence of char arrays is to produce the starting tag for code listings.

Stripping HTML tags 

Sometimes it's useful to take an HTML file and strip its tags so you have something
approximating the text that would be displayed in the Web browser, only as an ASCII text
file. The string class once again comes in handy. The following is a variation on the
theme of the previous example:

//: C01:HTMLStripper.cpp
// Filter to remove html tags and markers
#include "../require.h"
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

string replaceAll(string s, string f, string r) {
  string::size_type found = s.find(f);
  while(found != string::npos) {
    s.replace(found, f.length(), r);
    // Continue after the replacement text:
    found = s.find(f, found + r.length());
  }
  return s;
}

string stripHTMLTags(string s) {
  while(true) {
    string::size_type left = s.find('<');
    // Look for the closing brace after the opening one:
    string::size_type right = s.find('>', left);
    if(left == string::npos || right == string::npos)
      break;
    s = s.erase(left, right - left + 1);
  }
  s = replaceAll(s, "&amp;", "&");
  // Etc...
  return s;
}

int main(int argc, char* argv[]) {
  requireArgs(argc, 1,
    "usage: HTMLStripper InputFile");
  ifstream in(argv[1]);
  assure(in, argv[1]);
  const int sz = 4096;
  char buf[sz];
  while(in.getline(buf, sz)) {
    string s(buf);
    cout << stripHTMLTags(s) << endl;
  }
} ///:~

The string class can replace one string with another but there's no facility for replacing all the
strings of one type with another, so the replaceAll( ) function does this for you, inside a while
loop that keeps finding the next instance of the find string f. That function is used inside
stripHTMLTags after it uses erase( ) to remove everything that appears inside angle braces
('<' and '>'). Note that I probably haven't gotten all the necessary replacement values, but
you can see what to do (you might even put all the find-replace pairs in a table...). In main( )
the arguments are checked, and the file is read and converted. It is sent to standard output so
you must redirect it with '>' if you want to write it to a file.



Comparing strings 



Comparing strings is inherently different than comparing numbers. Numbers have constant,
universally meaningful values. To evaluate the relationship between the magnitude of two
strings, you must make a lexical comparison. Lexical comparison means that when you test a
character to see if it is "greater than" or "less than" another character, you are actually
comparing the numeric representation of those characters as specified in the collating
sequence of the character set being used. Most often, this will be the ASCII collating
sequence, which assigns the printable characters for the English language numbers in the
range from 32 to 127 decimal. In the ASCII collating sequence, the first "character" in the list
is the space, followed by several common punctuation marks, and then uppercase and
lowercase letters. With respect to the alphabet, this means that the letters nearer the front have
lower ASCII values than those nearer the end. With these details in mind, it becomes easier to
remember that when a lexical comparison reports s1 is "greater than" s2, it simply means
that when the two were compared, the first differing character in s1 came later in the alphabet
than the character in that same position in s2.

C++ provides several ways to compare strings, and each has its advantages. The simplest to
use are the non member overloaded operator functions operator ==, operator !=, operator >,
operator <, operator >=, and operator <=.

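//: C01:CompStr.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  // Strings to compare
  string s1("This ");
  string s2("That ");
  for(int i = 0; i < s1.size() &&
    i < s2.size(); i++)
    // See if the string elements are the same:
    if(s1[i] == s2[i])
      cout << s1[i] << " " << i << endl;
  // Use the string inequality operators
  if(s1 != s2) {
    cout << "Strings aren't the same:" << " ";
    if(s1 > s2)
      cout << "s1 is > s2" << endl;
    else
      cout << "s2 is > s1" << endl;
  }
} ///:~

Here's the output from CompStr.cpp:

T 0
h 1
  4
Strings aren't the same: s1 is > s2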

The overloaded comparison operators are useful for comparing both full strings and 
individual string elements. 

Notice in the code fragment below the flexibility of argument types on both the left and right 
hand side of the comparison operators. The overloaded operator set allows the direct 
comparison of string objects, quoted literals, and pointers to C style strings. 

  // The left operand is a quoted literal and
  // the right operand is a string
  if("That " == s2)
    cout << "A match" << endl;
  // The left operand is a string and the right operand is a
  // pointer to a C style null terminated string
  if(s1 != s2.c_str())
    cout << "No match" << endl;


You won't find the logical not (!) or the logical comparison operators (&& and ||) among the
operators for string. (Neither will you find overloaded versions of the bitwise C operators &,
|, ^, or ~.) The overloaded non member comparison operators for the string class are limited to
the subset which has clear, unambiguous application to single characters or groups of
characters.

The compare( ) member function offers you a great deal more sophisticated and precise
comparison than the non member operator set, because it returns a lexical comparison value,
and provides for comparisons that consider subsets of the string data. It provides overloaded
versions that allow you to compare two complete strings, part of either string to a complete
string, and subsets of two strings. This example compares complete strings:

//: C01:Compare.cpp
// Demonstrates compare(), swap()
#include <string>
#include <iostream>
using namespace std;

int main() {
  string first("This");
  string second("That");
  // Which is lexically greater?
  // compare() returns zero, a negative value
  // or a positive value, so test the sign:
  int result = first.compare(second);
  if(result == 0) { // The same
    cout << first << " and " << second <<
      " are lexically equal" << endl;
  } else {
    if(result < 0) // Less than
      first.swap(second);
    // Now first is the greater of the two
    cout << first <<
      " is lexically greater than " <<
      second << endl;
  }
} ///:~

The output from Compare.cpp looks like this: 

This is lexically greater than That

To compare a subset of the characters in one or both strings, you add arguments that define 
where to start the comparison and how many characters to consider. For example, we can use 
the overloaded version of compare( ): 

s1.compare(s1StartPos, s1NumberChars, s2, s2StartPos, s2NumberChars);

If we substitute the above version of compare( ) in the previous program so that it only looks
at the first two characters of each string, the program becomes:

//: C01:Compare2.cpp
// Overloaded compare()
#include <string>
#include <iostream>
using namespace std;

int main() {
  string first("This");
  string second("That");
  // Compare first two characters of each string:
  int result = first.compare(0, 2, second, 0, 2);
  if(result == 0) { // The same
    cout << first << " and " << second <<
      " are lexically equal" << endl;
  } else {
    if(result < 0) // Less than
      first.swap(second);
    // Now first is the greater of the two
    cout << first <<
      " is lexically greater than " <<
      second << endl;
  }
} ///:~

The output is: 

This and That are lexically equal
which is true, for the first two characters of "This" and "That." 

Indexing with [ ] vs. at( ) 

In the examples so far, we have used C style array indexing syntax to refer to an individual
character in a string. C++ strings provide an alternative to the s[n] notation: the at( ) member.
These two idioms produce the same result in C++ if all goes well:

//: C01:StringIndexing.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s("1234");
  cout << s[1] << " ";
  cout << s.at(1) << endl;
} ///:~

The output from this code looks like this:

2 2

However, there is one important difference between [ ] and at( ). When you try to reference 
an array element that is out of bounds, at() will do you the kindness of throwing an 
exception, while ordinary [ ] subscripting syntax will leave you to your own devices: 

//: C01:BadStringIndexing.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s("1234");
  // Runtime problem: goes beyond array bounds:
  cout << s[5] << endl;
  // Saves you by throwing an exception:
  cout << s.at(5) << endl;
} ///:~

Using at( ) in place of [ ] will give you a chance to gracefully recover from references to array 
elements that don't exist. at( ) throws an object of class out_of_range. By catching this object 
in an exception handler, you can take appropriate remedial actions such as recalculating the 
offending subscript or growing the array. (You can read more about Exception Handling in 
Chapter XX) 



Using iterators 



In the example program NewFind.cpp, we used a lot of messy and rather tedious C char
array handling code to change the case of the characters in a string and then search for the
occurrence of matches to a substring. Sometimes the "quick and dirty" method is justifiable,
but in general, you won't want to sacrifice the advantages of having your string data safely
and securely encapsulated in the C++ object where it lives.



Here is a better, safer way to handle case insensitive comparison of two C++ string objects.
Because no data is copied out of the objects and into C style strings, you don't have to use
pointers and you don't have to risk overwriting the bounds of an ordinary character array. In
this example, we use the string iterator. Iterators are themselves objects which move through
a collection or container of other objects, selecting them one at a time, but never providing
direct access to the implementation of the container. Iterators are not pointers, but they are
useful for many of the same jobs.

//: C01:CmpIter.cpp
#include <string>
#include <iostream>
#include <cctype>
using namespace std;

// Case insensitive compare function:
int stringCmpi(const string& s1, const string& s2) {
  // Select the first element of each string:
  string::const_iterator
    p1 = s1.begin(), p2 = s2.begin();
  // Don't run past the end:
  while(p1 != s1.end() && p2 != s2.end()) {
    // Compare upper-cased chars:
    if(toupper(*p1) != toupper(*p2))
      // Report which was lexically greater:
      return (toupper(*p1) < toupper(*p2)) ? -1 : 1;
    p1++;
    p2++;
  }
  // If they match up to the detected eos, say
  // which was longer. Return 0 if the same.
  if(s1.size() == s2.size()) return 0;
  return (s1.size() < s2.size()) ? -1 : 1;
}

int main() {
  string s1("Mozart");
  string s2("Modigliani");
  cout << stringCmpi(s1, s2) << endl;
} ///:~

Notice that the iterators p1 and p2 use the same syntax as C pointers - the '*' operator makes
the value of the element at the location given by the iterators available to the toupper( )
function. toupper( ) doesn't actually change the content of the element in the string. In fact,
it can't. This definition of p1 tells us that we can only use the elements p1 points to as
constants:

string::const_iterator p1 = s1.begin();

The way toupper( ) and the iterators are used in this example is called a case preserving case
insensitive comparison. This means that the string didn't have to be copied or rewritten to
accommodate case insensitive comparison. Both of the strings retain their original data,
unmodified.

Iterating in reverse 

Just as the standard C pointer gives us the increment (++) and decrement (--) operators to
make pointer arithmetic a bit more convenient, C++ string iterators come in two basic
varieties. You've seen end( ) and begin( ), which are the tools for moving forward through a
string one element at a time. The reverse iterators rend( ) and rbegin( ) allow you to step
backwards through a string. Here's how they work:

//: C01:RevStr.cpp
// Print a string in reverse
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s("987654321");
  // Use this iterator to walk backwards:
  string::reverse_iterator rev;
  // "Incrementing" the reverse iterator moves
  // it to successively lower string elements:
  for(rev = s.rbegin(); rev != s.rend(); rev++)
    cout << *rev;
  cout << endl;
} ///:~

The output from RevStr.cpp looks like this:

123456789

Reverse iterators act like pointers to elements of the string's character array, except that when
you apply the increment operator to them, they move backward rather than forward. rbegin( )
and rend( ) supply string locations that are consistent with this behavior, to wit, rbegin( )
refers to the last character of the string (it is built from the position just beyond the end), and
rend( ) marks the position just before the first character. Aside
from this, the main thing to remember about reverse iterators is that they aren't type
equivalent to ordinary iterators. For example, if a member function parameter list includes an
iterator as an argument, you can't substitute a reverse iterator to get the function to perform
its job walking backward through the string. Here's an illustration:

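// Compilers without member templates won't accept this
string sBackwards(s.rbegin(), s.rend());

Without member template support, the string constructor won't accept reverse iterators in
place of forward iterators in its parameter list. A standard-conforming library declares this
constructor as a member template, so it accepts any pair of input iterators, and the same holds
for the iterator-range versions of string members such as insert( ) and assign( ). Members
whose parameters are declared as plain iterators, such as erase( ), still require a forward
iterator.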

Strings and character traits 

We seem to have worked our way around the margins of case insensitive string comparisons
using C++ string objects, so maybe it's time to ask the obvious question: "Why isn't
case-insensitive comparison part of the standard string class?" The answer provides interesting
background on the true nature of C++ string objects.

Consider what it means for a character to have "case." Written Hebrew, Farsi, and Kanji don't
use the concept of upper and lower case, so for those languages this idea has no meaning at
all. This is the first impediment to built-in C++ support for case-insensitive character search
and comparison: the idea of case sensitivity is not universal, and therefore not portable.

It would seem that if there were a way of designating that some languages were "all 
uppercase" or "all lowercase" we could design a generalized solution. However, some 
languages which employ the concept of "case" also change the meaning of particular 
characters with diacritical marks: the cedilla in Spanish, the circumflex in French, and the 
umlaut in German. For this reason, any case-sensitive collating scheme that attempts to be 
comprehensive will be nightmarishly complex to use. 

Although we usually treat the C++ string as a class, this is really not the case. string is a
typedef of a more general constituent, the basic_string< > template. Observe how string is
declared in the standard C++ header file:

typedef basic_string<char> string;

To really understand the nature of strings, it's helpful to delve a bit deeper and look at the
template on which it is based. Here's the declaration of the basic_string< > template:

template<class charT,
  class traits = char_traits<charT>,
  class allocator = allocator<charT> >
class basic_string;

Earlier in this book, templates were examined in a great deal of detail. The main thing to
notice about the two declarations above is that the string type is created when the
basic_string template is instantiated with char. Inside the basic_string< > template
declaration, the line

class traits = char_traits<charT>,

tells us that the behavior of the class made from the basic_string< > template is specified by
a class based on the template char_traits< >. Thus, the basic_string< > template provides for
cases where you need string oriented classes that manipulate types other than char (wide
characters or Unicode, for example). To do this, the char_traits< > template controls the
content and collating behaviors of a variety of character sets using the character comparison
functions eq( ) (equal), ne( ) (not equal), and lt( ) (less than) upon which the basic_string< >
string comparison functions rely.

This is why the string class doesn't include case insensitive member functions: That's not in 
its job description. To change the way the string class treats character comparison, you must 
supply a different char_traits< > template, because that defines the behavior of the individual 
character comparison member functions. 

This information can be used to make a new type of string class that ignores case. First, we'll
define a new case insensitive char_traits< > template that inherits the existing one. Next,
we'll override only the members we need to change in order to make character-by-character
comparison case insensitive. (In addition to the three lexical character comparison members
mentioned above, we'll also have to supply new implementations of find( ) and compare( ).)
Finally, we'll typedef a new class based on basic_string, but using the case insensitive
ichar_traits template for its second argument.

//: C01:ichar_traits.h
// Creating your own character traits
#ifndef ICHAR_TRAITS_H
#define ICHAR_TRAITS_H
#include <string>
#include <cctype>
#include <cstddef>

struct ichar_traits : std::char_traits<char> {
  // We'll only change character by
  // character comparison functions
  static bool eq(char c1st, char c2nd) {
    return
      std::toupper(c1st) == std::toupper(c2nd);
  }
  static bool ne(char c1st, char c2nd) {
    return
      std::toupper(c1st) != std::toupper(c2nd);
  }
  static bool lt(char c1st, char c2nd) {
    return
      std::toupper(c1st) < std::toupper(c2nd);
  }
  static int compare(const char* str1,
    const char* str2, std::size_t n) {
    for(std::size_t i = 0; i < n; i++) {
      if(std::tolower(*str1) > std::tolower(*str2))
        return 1;
      if(std::tolower(*str1) < std::tolower(*str2))
        return -1;
      if(*str1 == 0 || *str2 == 0)
        return 0;
      str1++; str2++; // Compare the other chars
    }
    return 0;
  }
  static const char* find(const char* s1,
    std::size_t n, char c) {
    for(; n > 0; n--, s1++)
      if(std::toupper(*s1) == std::toupper(c))
        return s1;
    return 0; // Not found
  }
};
#endif // ICHAR_TRAITS_H ///:~
If we typedef an istring class like this:

typedef basic_string<char, ichar_traits,
  allocator<char> > istring;

Then this istring will act like an ordinary string in every way, except that it will make
comparisons without respect to case. Here's an example:

//: C01:ICompare.cpp
#include "ichar_traits.h"
#include <string>
#include <iostream>
using namespace std;

typedef basic_string<char, ichar_traits,
  allocator<char> > istring;

int main() {
  // The same letters except for case:
  istring first = "tHis";
  istring second = "ThIS";
  cout << first.compare(second) << endl;
} ///:~

The output from the program is "0", indicating that the strings compare as equal. This is a
simple example - in order to make istring fully equivalent to string, we'd have to create the
other functions necessary to support the new istring type.



A string application 
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3 it is presented here. 



The input is an HTML file that contains the usual stuff along with an applet tag with a
parameter that begins like this:

<param name="source_file" value="

The rest of the line contains encoded information about the site map, all combined into a
single line (it's rather long, but fortunately string objects don't care). Each entry may or may
not begin with a number of '#' signs; each of these indicates one level of depth. If no '#' sign
is present the entry will be considered to be at level one. After the '#' is the text to be
displayed on the page, followed by a '%' and the URL to use as the link. Each entry is
terminated by a '*'. Thus, a single entry in the line might look like this:

###|Useful Art%./Build/useful_art.html*

The '|' at the beginning is an artifact that needs to be removed.



My solution was to create an Item class whose constructor would take input text and create an
object that contains the text to be displayed, the URL and the level. The objects essentially
parse themselves, and at that point you can read any value you want. In main( ), the input file
is opened and read until the line contains the parameter that we're interested in. Everything
but the site map codes is stripped away from this string, and then it is parsed into Item
objects:

//: C01:SiteMapConvert.cpp
// Using strings to create a custom conversion
// program that generates HTML output
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;

class Item {
  string id, url;
  int depth;
  // Strip a leading '|' if there is one:
  string removeBar(string s) {
    if(!s.empty() && s[0] == '|')
      return s.substr(1);
    return s;
  }
public:
  Item(string in, int& index) : depth(0) {
    while(index < in.size() && in[index] == '#') {
      depth++;
      index++;
    }
    // 0 means no '#' marks were found:
    if(depth == 0) depth = 1;
    while(index < in.size() && in[index] != '%')
      id += in[index++];
    id = removeBar(id);
    index++; // Move past '%'
    while(index < in.size() && in[index] != '*')
      url += in[index++];
    url = removeBar(url);
    index++; // To move past '*'
  }
  string identifier() { return id; }
  string path() { return url; }
  int level() { return depth; }
};

int main(int argc, char* argv[]) {
  requireArgs(argc, 1,
    "usage: SiteMapConvert inputfilename");
  ifstream in(argv[1]);
  assure(in, argv[1]);
  ofstream out("plainmap.html");
  string line;
  while(getline(in, line)) {
    if(line.find("<param name=\"source_file\"")
       != string::npos) {
      // Extract data from start of sequence
      // until the terminating quote mark:
      line = line.substr(line.find("value=\"")
        + string("value=\"").size());
      line = line.substr(0,
        line.find_last_of("\""));
      int index = 0;
      while(index < line.size()) {
        Item item(line, index);
        string startLevel, endLevel;
        if(item.level() == 1) {
          startLevel = "<h1>";
          endLevel = "</h1>";
        } else
          for(int i = 0; i < item.level(); i++)
            for(int j = 0; j < 5; j++)
              out << "&nbsp;";
        string htmlLine = "<a href=\""
          + item.path() + "\">"
          + item.identifier() + "</a><br>";
        out << startLevel << htmlLine
            << endLevel << endl;
      }
      break; // Out of while loop
    }
  }
} ///:~



Item contains a private member function removeBar( ) that is used internally to strip off the
leading bars, if they appear.

The constructor for Item initializes depth to 0 to indicate that no '#' signs were found yet; if
none are found then it is assumed the Item should be displayed at level one. Each character in
the string is examined using operator[ ] to find the depth, id and url values. The other
member functions simply return these values.

After opening the files, main( ) uses string::find( ) to locate the line containing the site map
data. At this point, substr( ) is used in order to strip off the information before and after the
site map data. The subsequent while loop performs the parsing, but notice that the value index
is passed by reference into the Item constructor, and that constructor increments index as it
parses each new Item, thus moving forward in the sequence.

If an Item is at level one, then an HTML h1 tag is used, otherwise the elements are indented
using HTML non-breaking spaces. Note in the initialization of htmlLine how easy it is to
construct a string - you can just combine quoted character arrays and other string objects
using operator+.

When the output is written to the destination file, startLevel and endLevel will only produce
results if they have been given any value other than their default initialization values.



Summary 



C++ string objects provide developers with a number of great advantages over their C
counterparts. For the most part, the string class makes referring to strings through the use of
character pointers unnecessary. This eliminates an entire class of software defects that arise
from the use of uninitialized and incorrectly valued pointers. C++ strings dynamically and
transparently grow their internal data storage space to accommodate increases in the size of
the string data. This means that when the data in a string grows beyond the limits of the
memory initially allocated to it, the string object will make the memory management calls that
take space from and return space to the heap. Consistent allocation schemes prevent memory
leaks and have the potential to be much more efficient than "roll your own" memory
management.

The string class member functions provide a fairly comprehensive set of tools for creating,
modifying, and searching in strings. string comparisons are always case sensitive, but you
can work around this by copying string data to C style null terminated strings and using case
insensitive string comparison functions, temporarily converting the data held in string objects
to a single case, or by creating a case insensitive string class which overrides the character
traits used to create the basic_string object.



Exercises 



1. A palindrome is a word or group of words that read the same forward and
   backward. For example "madam" or "wow". Write a program that takes a
   string argument from the command line and returns TRUE if the string was
   a palindrome.

2. Sometimes the input from a file stream contains a two character sequence to
   represent a newline. These two characters (0x0a 0x0d) produce extra blank
   lines when the stream is printed to standard out. Write a program that finds
   the character 0x0d (ASCII carriage return) and deletes it from the string.

3. Write a program that reverses the order of the characters in a string.



2: Iostreams



There's much more you can do with the general I/O problem 
than just take standard I/O and turn it into a class. 

Wouldn't it be nice if you could make all the usual "receptacles" - standard I/O, files and 
even blocks of memory - look the same, so you need to remember only one interface? That's 
the idea behind iostreams. They're much easier, safer, and often more efficient than the 
assorted functions from the Standard C stdio library. 

Iostream is usually the first class library that new C++ programmers learn to use. This chapter
explores the use of iostreams, so they can replace the C I/O functions through the rest of the
book. In future chapters, you'll see how to set up your own classes so they're compatible with
iostreams.


Why iostreams? 
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//: C02:FileClass.h 
// Stdio files wrapped
#ifndef FILECLAS_H
#define FILECLAS_H
#include <cstdio>

class FileClass {
  std::FILE* f;
public:
  FileClass(const char* fname, const char* mode = "r");
  ~FileClass();
  std::FILE* fp();
};
#endif // FILECLAS_H ///:~



In C when you perform file I/O, you work with a naked pointer to a FILE struct, but this class 
wraps around the pointer and guarantees it is properly initialized and cleaned up using the 
constructor and destructor. The second constructor argument is the file mode, which defaults 
to "r" for "read." 

To fetch the value of the pointer to use in the file I/O functions, you use the fp( ) access
function. Here are the member function definitions:

//: C02:FileClass.cpp {O}
// Implementation
#include "FileClass.h"
#include <cstdlib>
using namespace std;

FileClass::FileClass(const char* fname, const char* mode) {
  f = fopen(fname, mode);
  if(f == NULL) {
    printf("%s: file not found\n", fname);
    exit(1);
  }
}

FileClass::~FileClass() { fclose(f); }

FILE* FileClass::fp() { return f; } ///:~

The constructor calls fopen( ), as you would normally do, but it also checks to ensure the
result isn't zero, which indicates a failure upon opening the file. If there's a failure, the name
of the file is printed and exit( ) is called.

The destructor closes the file, and the access function fp( ) returns f. Here's a simple example
using class FileClass:

//: C02:FileClassTest.cpp
//{L} FileClass
// Testing class File
#include "FileClass.h"
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FileClass f(argv[1]); // Opens and tests
  const int bsize = 100;
  char buf[bsize];
  while(fgets(buf, bsize, f.fp()))
    puts(buf);
} // File automatically closed by destructor
///:~

You create the FileClass object and use it in normal C file I/O function calls by calling fp( ).
When you're done with it, just forget about it, and the file is closed by the destructor at the
end of the scope.



True wrapping 



Even though the FILE pointer is private, it isn't particularly safe because fp( ) retrieves it. The
only effect seems to be guaranteed initialization and cleanup, so why not make it public, or
use a struct instead? Notice that while you can get a copy of f using fp( ), you cannot assign
to f - that's completely under the control of the class. Of course, after capturing the pointer
returned by fp( ), the client programmer can still assign to the structure elements, so the safety
is in guaranteeing a valid FILE pointer rather than proper contents of the structure.

If you want complete safety, you have to prevent the user from direct access to the FILE 
pointer. This means some version of all the normal file I/O functions will have to show up as 
class members, so everything you can do with the C approach is available in the C++ class: 

//: C02:Fullwrap.h
// Completely hidden file IO
#ifndef FULLWRAP_H
#define FULLWRAP_H
#include <cstdio>

class File {
  std::FILE* f;
  std::FILE* F(); // Produces checked pointer to f
public:
  File(); // Create object but don't open file
  File(const char* path,
       const char* mode = "r");
  ~File();
  int open(const char* path,
           const char* mode = "r");
  int reopen(const char* path,
             const char* mode);
  int getc();
  int ungetc(int c);
  int putc(int c);
  int puts(const char* s);
  int printf(const char* format, ...);
  std::size_t read(void* ptr, std::size_t size,
                   std::size_t n);
  int eof();
  int close();
  int flush();
  int seek(long offset, int whence);
  int getpos(std::fpos_t* pos);
  int setpos(const std::fpos_t* pos);
  long tell();
  void rewind();
  void setbuf(char* buf);
  int setvbuf(char* buf, int type, std::size_t sz);
};
#endif // FULLWRAP_H ///:~

This class contains almost all the file I/O functions from cstdio. vfprintf( ) is missing; it is
used to implement the printf( ) member function.

File has the same constructor as in the previous example, and it also has a default constructor. 
The default constructor is important if you want to create an array of File objects or use a File 
object as a member of another class where the initialization doesn't happen in the constructor 
(but sometime after the enclosing object is created). 

The default constructor sets the private FILE pointer f to zero. But now, before any reference
to f, its value must be checked to ensure it isn't zero. This is accomplished with the
member function F( ), which is private because it is intended to be used only by
other member functions. (We don't want to give the user direct access to the FILE structure
in this class.)

This is not a terrible solution by any means. It's quite functional, and you could imagine 
making similar classes for standard (console) I/O and for in-core formatting (reading/writing a 
piece of memory rather than a file or the console). 

(The implementation and test files for FULLWRAP are available in the freely distributed
source code for this book. See preface for details.)

The big stumbling block is the runtime interpreter used for the variable-argument list
functions. This is the code that parses through your format string at runtime and grabs and
interprets arguments from the variable argument list. It's a problem for four reasons:

1. Even if you use only a fraction of the functionality of the interpreter, the
whole thing gets loaded. So if you say:
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printf ("%c", 'x' ) ; 

you'll get the whole package, including the parts that print out floating- 
point numbers and strings. There's no option for reducing the amount of 
space used by the program. 

2. Because the interpretation happens at runtime there's a performance
overhead you can't get rid of. It's frustrating because all the information is
there in the format string at compile time, but it's not evaluated until
runtime. However, if you could parse the arguments in the format string at
compile time you could make hard function calls that have the potential to
be much faster than a runtime interpreter (although the printf( ) family of
functions is usually quite well optimized).

3. A worse problem occurs because the evaluation of the format string doesn't
happen until runtime: there can be no compile-time error checking. You're
probably very familiar with this problem if you've tried to find bugs that
came from using the wrong number or type of arguments in a printf( )
statement. C++ makes a big deal out of compile-time error checking to find
errors early and make your life easier. It seems a shame to throw it away for
an I/O library, especially because I/O is used a lot.

4. For C++, the most important problem is that the printf( ) family of
functions is not particularly extensible. They're really designed to handle
the four basic data types in C (char, int, float, double and their variations).
You might think that every time you add a new class, you could add an
overloaded printf( ) and scanf( ) function (and their variants for files and
strings) but remember, overloaded functions must have different types in
their argument lists and the printf( ) family hides its type information in the
format string and in the variable argument list. For a language like C++,
whose goal is to be able to easily add new data types, this is an ungainly
restriction.



Iostreams to the rescue
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In addition to gaining a great deal of leverage and clarity in your dealings with I/O and 
formatting, you'll also see how a really powerful C++ library can work. 

Sneak preview of operator overloading 
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In Chapter XX, you learned how function overloading allows you to use the same function 
name with different argument lists. Now imagine that when the compiler sees an expression 
consisting of an argument followed by an operator followed by an argument, it simply calls a 
function. That is, an operator is simply a function call with a different syntax. 

Of course, this is C++, which is very particular about data types. So there must be a 
previously declared function to match that operator and those particular argument types, or 
the compiler will not accept the expression. 

What most people find immediately disturbing about operator overloading is the thought that 
maybe everything they know about operators in C is suddenly wrong. This is absolutely false. 
Here are two of the sacred design goals of C++: 

1. A program that compiles in C will compile in C++. The only compilation
errors and warnings from the C++ compiler will result from the "holes" in 
the C language, and fixing these will require only local editing. (Indeed, the 
complaints by the C++ compiler usually lead you directly to undiscovered 
bugs in the C program.) 

2. The C++ compiler will not secretly change the behavior of a C program by 
recompiling it under C++. 

Keeping these goals in mind will help answer a lot of questions; knowing there are no
capricious changes to C when moving to C++ helps make the transition easy. In particular,
operators for built-in types won't suddenly start working differently - you cannot change their
meaning. Overloaded operators can be created only where new data types are involved. So
you can create a new overloaded operator for a new class, but the expression

1 << 4;

won't suddenly change its meaning, and the illegal code

1.414 << 1;

won't suddenly start working.
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Inserters and extractors 



In the iostreams library, two operators have been overloaded to make the use of iostreams
easy. The operator << is often referred to as an inserter for iostreams, and the operator >> is
often referred to as an extractor.



A stream is an object that formats and holds bytes. You can have an input stream (istream) or
an output stream (ostream). There are different types of istreams and ostreams: ifstreams and
ofstreams for files, istrstreams and ostrstreams for char* memory (in-core formatting), and
istringstreams and ostringstreams for interfacing with the Standard C++ string class. All these
stream objects have the same interface, regardless of whether you're working with a file,
standard I/O, a piece of memory or a string object. The single interface you learn also works
for extensions added to support new classes.

If a stream is capable of producing bytes (an istream), you can get information from the
stream using an extractor. The extractor produces and formats the type of information that's
expected by the destination object. To see an example of this, you can use the cin object,
which is the iostream equivalent of stdin in C, that is, redirectable standard input. This object
is pre-defined whenever you include the <iostream> header file. (Thus, the iostream library is
automatically linked with most compilers.)



char buf[100];
cin >> buf;

There's an overloaded operator >> for every data type you can use as the right-hand
argument of >> in an iostream statement. (You can also overload your own, which you'll see
in a later chapter.)

To find out what you have in the various variables, you can use the cout object
(corresponding to standard output; there's also a cerr object corresponding to standard error)
with the inserter <<:

cout << "buf = ";
cout << buf;

This is notably tedious, and doesn't seem like much of an improvement over printf( ), type
checking or no. Fortunately, the overloaded inserters and extractors in iostreams are designed
to be chained together into a complex expression that is much easier to write:

cout << "i = " << i << endl;
cout << "f = " << f << endl;
cout << "c = " << c << endl;
cout << "buf = " << buf << endl;

You'll understand how this can happen in a later chapter, but for now it's sufficient to take the
attitude of a class user and just know it works that way.



Manipulators 



One new element has been added here: a manipulator called endl. A manipulator acts on the 
stream itself; in this case it inserts a newline and flushes the stream (puts out all pending 
characters that have been stored in the internal stream buffer but not yet output). You can also 
just flush the stream: 

cout << flush;

There are additional basic manipulators that will change the number base to oct (octal), dec
(decimal) or hex (hexadecimal):

cout << hex << "0x" << i << endl;

There's a manipulator for extraction that "eats" white space:

cin >> ws;

and a manipulator called ends, which is like endl, only for strstreams (covered in a while).
These are all the manipulators in <iostream>, but there are more in <iomanip> you'll see
later in the chapter.



Common usage 



Although cin and the extractor >> provide a nice balance to cout and the inserter <<, in
practice using formatted input routines, especially with standard input, has the same problems
you run into with scanf( ). If the input produces an unexpected value, the process is skewed,
and it's very difficult to recover. In addition, formatted input defaults to whitespace
delimiters. So if you collect the above code fragments into a program
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// : C02 : losexamp . cpp 
// lostream examples 
linclude <io3tream> 
using namespace std; 

int mainl) { 



char buf [100]; 






cin » buf; 






cout « "i = " 


« i 


« endl; 


coot « "f = '■ 


« f 


« endl; 


cout « "c = " 


« c 


« endl; 


cout « "buf = 




buf « e 


cout « flush; 






cout « hex « 




« i « 


} ///:- 







and give it the following input,

12 1.4 c this is a test

you'll get the same output as if you give it

12
1.4
c
this is a test

and the output is, somewhat unexpectedly,

i = 12
f = 1.4
c = c
buf = this
0xc
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Notice that buf got only the first word because the input routine looked for a space to delimit 
the input, which it saw after "this." In addition, if the continuous input string is longer than 
the storage allocated for buf, you'll overrun the buffer. 

It seems cin and the extractor are provided only for completeness, and this is probably a good
way to look at it. In practice, you'll usually want to get your input a line at a time as a 
sequence of characters and then scan them and perform conversions once they're safely in a 
buffer. This way you don't have to worry about the input routine choking on unexpected data. 

Another thing to consider is the whole concept of a command-line interface. This has made 
sense in the past when the console was little more than a glass typewriter, but the world is 
rapidly changing to one where the graphical user interface (GUI) dominates. What is the 
meaning of console I/O in such a world? It makes much more sense to ignore cin altogether 
other than for very simple examples or tests, and take the following approaches: 

1. If your program requires input, read that input from a file - you'll soon see
it's remarkably easy to use files with iostreams. Iostreams for files still
work fine with a GUI.

2. Read the input without attempting to convert it. Once the input is someplace
where it can't foul things up during conversion, then you can safely scan it.

3. Output is different. If you're using a GUI, cout doesn't work and you must
send it to a file (which is identical to sending it to cout) or use the GUI
facilities for data display. Otherwise it often makes sense to send it to cout.
In both cases, the output formatting functions of iostreams are highly useful.
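
The second approach, reading first and converting afterwards, can be sketched like this. The helper names parse_line and count_good_lines are assumptions for illustration, and the sketch uses std::string and istringstream (the stringstreams covered later in this chapter) to do the scanning once a line is safely in memory.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: converts "int float" from a line already held in memory.
// Returns true only if both conversions succeed.
bool parse_line(const std::string& line, int& i, float& f) {
  std::istringstream scan(line);
  return static_cast<bool>(scan >> i >> f);
}

// Reads whole lines first and converts afterwards. A malformed line only
// spoils the temporary istringstream, never the real input stream.
int count_good_lines(std::istream& in) {
  int good = 0;
  std::string line;
  while (std::getline(in, line)) {
    int i;
    float f;
    if (parse_line(line, i, f)) ++good;
  }
  return good;
}
```

Because a bad line is simply skipped, the loop keeps going after unexpected data instead of leaving the input stream in a failed state.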



Line-oriented input 



To grab input a line at a time, you have two choices: the member functions get( ) and
getline( ). Both functions take three arguments: a pointer to a character buffer in which to
store the result, the size of that buffer (so they don't overrun it), and the terminating character,
to know when to stop reading input. The terminating character has a default value of '\n',
which is what you'll usually use. Both functions store a zero in the result buffer when they
encounter the terminating character in the input.

So what's the difference? Subtle, but important: get( ) stops when it sees the delimiter in the 
input stream, but it doesn't extract it from the input stream. Thus, if you did another get() 
using the same delimiter it would immediately return with no fetched input. (Presumably, you 
either use a different delimiter in the next get( ) statement or a different input function.) 
getline( ), on the other hand, extracts the delimiter from the input stream, but still doesn't 
store it in the result buffer. 

Generally, when you're processing a text file that you read a line at a time, you'll want to use 
getline( ). 
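
The difference can be made visible with a small sketch. The two helper names are assumptions for illustration; each reads one line and then reports which character is still waiting in the stream.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads one line with get( ); the '\n' stays in the stream.
// Returns the character still waiting in the stream afterwards.
char next_after_get(std::istream& in, char* buf, int sz) {
  in.get(buf, sz);
  return static_cast<char>(in.peek());
}

// The same with getline( ): the '\n' is extracted and thrown away.
char next_after_getline(std::istream& in, char* buf, int sz) {
  in.getline(buf, sz);
  return static_cast<char>(in.peek());
}
```

Given the input "one\ntwo\n", both calls leave "one" in the buffer, but after get( ) the waiting character is the newline, while after getline( ) it is the 't' of the next line.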
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Overloaded versions of get( ) 



get( ) also comes in three other overloaded versions: one with no arguments that returns the 
next character, using an int return value; one that stuffs a character into its char argument, 
using a reference (You'll have to jump forward to Chapter XX if you want to understand it 
right this minute ....); and one that stores directly into the underlying buffer structure of 
another iostream object. That is explored later in the chapter. 



Reading raw bytes 



If you know exactly what you're dealing with and want to move the bytes directly into a
variable, array, or structure in memory, you can use read( ). The first argument is a pointer to
the destination memory, and the second is the number of bytes to read. This is especially
useful if you've previously stored the information to a file, for example, in binary form using
the complementary write( ) member function for an output stream. You'll see examples of all
these functions later.
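
As a minimal sketch of the read( ) and write( ) pairing, assuming a hypothetical plain-data structure Record and helper names save and load (none of which come from the original text):

```cpp
#include <iostream>
#include <sstream>

struct Record {  // hypothetical plain-data structure
  int id;
  double value;
};

// Writes the object's bytes unformatted.
void save(std::ostream& out, const Record& r) {
  out.write(reinterpret_cast<const char*>(&r), sizeof r);
}

// Reads the same number of bytes back; returns false if the stream ran short.
bool load(std::istream& in, Record& r) {
  in.read(reinterpret_cast<char*>(&r), sizeof r);
  return in.gcount() == static_cast<std::streamsize>(sizeof r);
}
```

This only makes sense for simple structures read back by the same program on the same platform, since the raw bytes depend on the machine's layout of the type.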



Error handling



All the versions of get( ) and getline( ) return the input stream from which the characters
came except for get( ) with no arguments, which returns the next character or EOF. If you get
the input stream object back, you can ask it if it's still OK. In fact, you can ask any iostream
object if it's OK using the member functions good( ), eof( ), fail( ), and bad( ). These return
state information based on the eofbit (indicates the buffer is at the end of sequence), the
failbit (indicates some operation has failed because of formatting issues or some other
problem that does not affect the buffer) and the badbit (indicates something has gone wrong
with the buffer).

However, as mentioned earlier, the state of an input stream generally gets corrupted in weird
ways only when you're trying to do input to specific types and the type read from the input is
inconsistent with what is expected. Then of course you have the problem of what to do with
the input stream to correct the problem. If you follow my advice and read input a line at a
time or as a big glob of characters (with read( )) and don't attempt to use the input formatting
functions except in simple cases, then all you're concerned with is whether you're at the end
of the input (EOF). Fortunately, testing for this turns out to be simple and can be done inside
of conditionals, such as while(cin) or if(cin). For now you'll have to accept that when you use
an input stream object in this context, the right value is safely, correctly and magically
produced to indicate whether the object has reached the end of the input. You can also use the
Boolean NOT operator !, as in if(!cin), to indicate the stream is not OK; that is, you've
probably reached the end of input and should quit trying to read the stream.
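
The state functions can be explored with a short sketch. The helper name state_of and the strings it returns are assumptions for illustration; the member functions it calls are the ones named above.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: names the state of a stream after an operation.
std::string state_of(const std::ios& s) {
  if (s.bad()) return "bad";
  if (s.eof() && s.fail()) return "eof+fail";
  if (s.fail()) return "fail";
  if (s.eof()) return "eof";
  return "good";
}
```

For example, extracting an int from a stream holding "12 x" succeeds once and leaves the stream good; a second extraction meets the 'x', sets the failbit, and makes if(!stream) true. Extracting from an empty stream sets both the eofbit and the failbit.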

There are times when the stream becomes not-OK, but you understand this condition and
want to go on using it. For example, if you reach the end of an input file, the eofbit and failbit
are set, so a conditional on that stream object will indicate the stream is no longer good.
However, you may want to continue using the file, by seeking to an earlier position and
reading more data. To correct the condition, simply call the clear( ) member function.
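
A minimal sketch of that recovery, assuming a hypothetical helper named first_line_again and using std::getline with std::string for brevity:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads to the end of the stream, then rewinds and returns the first line again.
// Without clear( ), the second read would fail because failbit is still set.
std::string first_line_again(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {}  // run to the end: eofbit and failbit get set
  in.clear();                        // make the stream usable again
  in.seekg(0, std::ios::beg);        // back to the start
  std::getline(in, line);
  return line;
}
```

The same pattern applies to an ifstream; an istringstream is just a convenient seekable stream to try it on.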



File iostreams 



Manipulating files with iostreams is much easier and safer than using cstdio in C. All you do
to open a file is create an object; the constructor does the work. You don't have to explicitly
close a file (although you can, using the close( ) member function) because the destructor will
close it when the object goes out of scope.

To create a file that defaults to input, make an ifstream object. To create one that defaults to 
output, make an ofstream object. 

Here's an example that shows many of the features discussed so far. Note the inclusion of
<fstream> to declare the file I/O classes; this also includes <iostream>.

//: C02:Strfile.cpp
// Stream I/O with files
// The difference between get() & getline()
#include "../require.h"
#include <fstream>
#include <iostream>
using namespace std;

int main() {
  const int sz = 100; // Buffer size;
  char buf[sz];
  {
    ifstream in("Strfile.cpp"); // Read
    assure(in, "Strfile.cpp"); // Verify open
    ofstream out("Strfile.out"); // Write
    assure(out, "Strfile.out");
    int i = 1; // Line counter

    // A less-convenient approach for line input:
    while(in.get(buf, sz)) { // Leaves \n in input
      in.get(); // Throw away next character (\n)
      cout << buf << endl; // Must add \n
      // File output just like standard I/O:
      out << i++ << ": " << buf << endl;
    }
  } // Destructors close in & out

  ifstream in("Strfile.out");
  assure(in, "Strfile.out");
  // More convenient line input:
  while(in.getline(buf, sz)) { // Removes \n
    char* cp = buf;
    while(*cp != ':')
      cp++;
    cp += 2; // Past ": "
    cout << cp << endl; // Must still add \n
  }
} ///:~

Footnote: Newer implementations of iostreams will still support this style of handling errors,
but in some cases will also throw exceptions.

The creation of both the ifstream and ofstream are followed by an assure( ) to guarantee the
file has been successfully opened. Here again the object, used in a situation where the 
compiler expects an integral result, produces a value that indicates success or failure. (To do 
this, an automatic type conversion member function is called. These are discussed in Chapter 
XX.) 

The first while loop demonstrates the use of two forms of the get( ) function. The first gets 
characters into a buffer and puts a zero terminator in the buffer when either sz - 1 characters 
have been read or the third argument (defaulted to '\n') is encountered. get( ) leaves the 
terminator character in the input stream, so this terminator must be thrown away via in.get( ) 
using the form of get( ) with no argument, which fetches a single byte and returns it as an int. 
You can also use the ignore( ) member function, which has two defaulted arguments. The 
first is the number of characters to throw away, and defaults to one. The second is the 
character at which the ignore( ) function quits (after extracting it) and defaults to EOF. 
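
As a sketch of ignore( ) in both forms (the helper names second_line and skip_past_colon are assumptions for illustration):

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads two lines with get( ), using ignore( ) to discard the delimiter between them.
std::string second_line(std::istream& in) {
  char buf[100];
  in.get(buf, sizeof buf);  // first line; '\n' is left in the stream
  in.ignore();              // default arguments: throw away one character
  in.get(buf, sizeof buf);  // second line
  return buf;
}

// Skips everything up to and including the next ':' (at most 100 characters).
void skip_past_colon(std::istream& in) {
  in.ignore(100, ':');
}
```

The second form is handy for discarding a prefix such as a line number followed by a colon.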

Next you see two output statements that look very similar: one to cout and one to the file out. 
Notice the convenience here; you don't need to worry about what kind of object you're 
dealing with because the formatting statements work the same with all ostream objects. The 
first one echoes the line to standard output, and the second writes the line out to the new file 
and includes a line number. 

To demonstrate getline( ), it's interesting to open the file we just created and strip off the line 
numbers. To ensure the file is properly closed before opening it to read, you have two choices. 
You can surround the first part of the program in braces to force the out object out of scope, 
thus calling the destructor and closing the file, which is done here. You can also call close( ) 
for both files; if you want, you can even reuse the in object by calling the open() member 
function (you can also create and destroy the object dynamically on the heap as is in Chapter 
XX). 
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The second while loop shows how getline( ) removes the terminator character (its third 
argument, which defaults to '\n') from the input stream when it's encountered. Although 
getline( ), like get( ), puts a zero in the buffer, it still doesn't insert the terminating character.



Open modes 



Flag              Function

ios::in           Opens an input file. Use this as an open mode for an
                  ofstream to prevent truncating an existing file.

ios::out          Opens an output file. When used for an ofstream without
                  ios::app, ios::ate or ios::in, ios::trunc is implied.

ios::app          Opens an output file for appending.

ios::ate          Opens an existing file (either input or output) and
                  seeks the end.

ios::nocreate     Opens a file only if it already exists. (Otherwise it fails.)

ios::noreplace    Opens a file only if it does not exist. (Otherwise it fails.)

ios::trunc        Opens a file and deletes the old file, if it already exists.

ios::binary       Opens a file in binary mode. Default is text mode.

These flags can be combined using a bitwise or.
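
A small sketch of combining flags, assuming a hypothetical helper write_then_append and a file name supplied by the caller. It uses only ios::out and ios::app.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes `first`, reopens the file in append mode to add
// `second`, then returns the file's full contents.
std::string write_then_append(const char* name,
                              const std::string& first,
                              const std::string& second) {
  {
    std::ofstream out(name);  // ios::out alone implies truncation
    out << first;
  }                           // destructor closes the file
  {
    std::ofstream out(name, std::ios::out | std::ios::app);  // flags joined with |
    out << second;
  }
  std::ifstream in(name);
  std::string all, line;
  while (std::getline(in, line)) all += line + '\n';
  return all;
}
```

Without ios::app in the second open, the file would be truncated again and only the second string would survive.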



Iostream buffering



Whenever you create a new class, you should endeavor to hide the details of the underlying
implementation as much as possible from the user of the class. Try to show them only what
they need to know and make the rest private to avoid confusion. Normally when using iostreams
you don't know or care where the bytes are being produced or consumed; indeed, this is different
depending on whether you're dealing with standard I/O, files, memory, or some newly created
class or device.

There comes a time, however, when it becomes important to be able to send messages to the
part of the iostream that produces and consumes bytes. To provide this part with a common
interface and still hide its underlying implementation, it is abstracted into its own class, called
streambuf. Each iostream object contains a pointer to some kind of streambuf. (The kind
depends on whether it deals with standard I/O, files, memory, etc.) You can access the
streambuf directly; for example, you can move raw bytes into and out of the streambuf,
without formatting them through the enclosing iostream. This is accomplished, of course, by
calling member functions for the streambuf object.

Currently, the most important thing for you to know is that every iostream object contains a
pointer to a streambuf object, and the streambuf has some member functions you can call if
you need to.

To allow you to access the streambuf, every iostream object has a member function called
rdbuf( ) that returns the pointer to the object's streambuf. This way you can call any member
function for the underlying streambuf. However, one of the most interesting things you can
do with the streambuf pointer is to connect it to another iostream object using the <<
operator. This drains all the bytes from your object into the one on the left-hand side of the
<<. This means if you want to move all the bytes from one iostream to another, you don't
have to go through the tedium (and potential coding errors) of reading them one byte or one
line at a time. It's a much more elegant approach.

For example, here's a very simple program that opens a file and sends the contents out to 
standard output (similar to the previous example): 



//: C02:Stype.cpp
#include "../require.h"
#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1); // Must have a command line
  ifstream in(argv[1]);
  assure(in, argv[1]); // Ensure file exists
  cout << in.rdbuf(); // Outputs entire file
} ///:~

After making sure there is an argument on the command line, an ifstream is created using this
argument. The open will fail if the file doesn't exist, and this failure is caught by the
assure( ) call.

All the work really happens in the statement

cout << in.rdbuf();

which causes the entire contents of the file to be sent to cout. This is not only more succinct
to code, it is often more efficient than moving the bytes one at a time.
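
The same idea works between any two streams. A minimal sketch, assuming a hypothetical helper named copy_all:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Drains everything left in `in` into `out` with a single insertion.
void copy_all(std::istream& in, std::ostream& out) {
  out << in.rdbuf();
}
```

Calling copy_all(file, cout) on an open ifstream has the same effect as the statement above; the helper can also copy one file to another, or a file into an in-memory stream.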

Using get( ) with a streambuf 

There is a form of get( ) that allows you to write directly into the streambuf of another
object. The first argument is the destination streambuf (whose address is mysteriously taken
using a reference, discussed in Chapter XX), and the second is the terminating character,
which stops the get( ) function. So yet another way to print a file to standard output is

//: C02:Sbufget.cpp
// Get directly into a streambuf
#include "../require.h"
#include <fstream>
#include <iostream>
using namespace std;

int main() {
  ifstream in("Sbufget.cpp");
  assure(in, "Sbufget.cpp");
  while(in.get(*cout.rdbuf()))
    in.ignore();
} ///:~

rdbuf( ) returns a pointer, so it must be dereferenced to satisfy the function's need to see an 
object. The get( ) function, remember, doesn't pull the terminating character from the input 
stream, so it must be removed using ignore( ) so get( ) doesn't just bonk up against the 
newline forever (which it will, otherwise). 

You probably won't need to use a technique like this very often, but it may be useful to know
it exists.


Seeking in iostreams 



Each type of iostream has a concept of where its "next" character will come from (if it's an
istream) or go (if it's an ostream). In some situations you may want to move this stream
position. You can do it using two models: One uses an absolute location in the stream called
the streampos; the second works like the Standard C library function fseek( ) for a file and
moves a given number of bytes from the beginning, end, or current position in the file.

The streampos approach requires that you first call a "tell" function: tellp( ) for an ostream
or tellg( ) for an istream. (The "p" refers to the "put pointer" and the "g" refers to the "get
pointer.") This function returns a streampos you can later use in the single-argument version
of seekp( ) for an ostream or seekg( ) for an istream, when you want to return to that position
in the stream.

The second approach is a relative seek and uses overloaded versions of seekp( ) and seekg( ).
The first argument is the number of bytes to move: it may be positive or negative. The second
argument is the seek direction:

ios::beg    From beginning of stream
ios::cur    Current position in stream
ios::end    From end of stream

Here's an example that shows the movement through a file, but remember, you're not limited
to seeking within files, as you are with C and cstdio. With C++, you can seek in any type of
iostream (although the behavior of cin & cout when seeking is undefined):

//: C02:Seeking.cpp
#include "../require.h"
#include <iostream>
#include <fstream>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]); // File must already exist
  in.seekg(0, ios::end); // End of file
  streampos sp = in.tellg(); // Size of file
  cout << "file size = " << sp << endl;
  in.seekg(-sp/10, ios::end);
  streampos sp2 = in.tellg();
  in.seekg(0, ios::beg); // Start of file
  cout << in.rdbuf(); // Print whole file
  in.seekg(sp2); // Move to streampos
  // Prints the last 1/10th of the file:
  cout << endl << endl << in.rdbuf() << endl;
} ///:~

This program picks a file name off the command line and opens it as an ifstream. assure( )
detects an open failure. Because this is a type of istream, seekg( ) is used to position the "get
pointer." The first call seeks zero bytes off the end of the file, that is, to the end. Because a
streampos can be treated as an integral byte offset (in older implementations it is simply a
typedef for a long), calling tellg( ) at that point also returns the size of the file,
which is printed out. Then a seek is performed moving the get pointer 1/10 the size of the file
- notice it's a negative seek from the end of the file, so it backs up from the end. If you try to
seek positively from the end of the file, the get pointer will just stay at the end. The
streampos at that point is captured into sp2, then a seekg( ) is performed back to the
beginning of the file so the whole thing can be printed out using the streambuf pointer
produced with rdbuf( ). Finally, the overloaded version of seekg( ) is used with the
streampos sp2 to move to the previous position, and the last portion of the file is printed out.
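
Since seeking is not limited to files, the relative seek can be tried on an in-memory stream. This sketch assumes a hypothetical helper named tail that returns the last n characters of any seekable istream:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the last `n` characters of a seekable stream.
std::string tail(std::istream& in, int n) {
  in.seekg(0, std::ios::end);       // relative seek: zero bytes from the end
  std::streamoff size = in.tellg(); // position at the end equals the size
  if (n > size) n = static_cast<int>(size);
  in.seekg(-n, std::ios::end);      // a negative offset backs up from the end
  std::ostringstream out;
  out << in.rdbuf();                // drain from the new position to the end
  return out.str();
}
```

The same function works on an ifstream opened on a real file, which mirrors what Seeking.cpp does with the last tenth of its input.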



Creating read/write files 



Now that you know about the streambuf and how to seek, you can understand how to create
a stream object that will both read and write a file. The following code first creates an
ifstream with flags that say it's both an input and an output file. The compiler won't let you
write to an ifstream, however, so you need to create an ostream with the underlying stream
buffer:

ifstream in("filename", ios::in | ios::out);
ostream out(in.rdbuf());

You may wonder what happens when you write to one of these objects. Here's an example: 

//: C02:Iofile.cpp
// Reading & writing one file
#include "../require.h"
#include <iostream>
#include <fstream>
using namespace std;

int main() {
  ifstream in("Iofile.cpp");
  assure(in, "Iofile.cpp");
  ofstream out("Iofile.out");
  assure(out, "Iofile.out");
  out << in.rdbuf(); // Copy file
  in.close();
  out.close();
  // Open for reading and writing:
  ifstream in2("Iofile.out", ios::in | ios::out);
  assure(in2, "Iofile.out");
  ostream out2(in2.rdbuf());
  cout << in2.rdbuf(); // Print whole file
  out2 << "Where does this end up?";
  out2.seekp(0, ios::beg);
  out2 << "And what about this?";
  in2.seekg(0, ios::beg);
  cout << in2.rdbuf();
} ///:~

The first five lines copy the source code for this program into a file called Iofile.out, and then
close the files. This gives us a safe text file to play around with. Then the aforementioned
technique is used to create two objects that read and write to the same file. In cout <<
in2.rdbuf( ), you can see the "get" pointer is initialized to the beginning of the file. The "put"
pointer, however, is set to the end of the file because "Where does this end up?" appears
appended to the file. However, if the put pointer is moved to the beginning with a seekp( ), all
the inserted text overwrites the existing text. Both writes are seen when the get pointer is
moved back to the beginning with a seekg( ), and the file is printed out. Of course, the file is
automatically saved and closed when out2 goes out of scope and its destructor is called.



stringstreams 



strstreams 



Before there were stringstreams, there were the more primitive strstreams. Although these
are not an official part of Standard C++, they have been around a long time so compilers will
no doubt leave in the strstream support in perpetuity, to compile legacy code. You should
always use stringstreams, but it's certainly likely that you'll come across code that uses
strstreams and at that point this section should come in handy. In addition, this section
should make it fairly clear why stringstreams have replaced strstreams.

A strstream works directly with memory instead of a file or standard output. It allows you to
use the same reading and formatting functions to manipulate bytes in memory. On old
computers the memory was referred to as core so this type of functionality is often called
in-core formatting.

The class names for strstreams echo those for file streams. If you want to create a strstream to
extract characters from, you create an istrstream. If you want to put characters into a
strstream, you create an ostrstream.

String streams work with memory, so you must deal with the issue of where the memory
comes from and where it goes. This isn't terribly complicated, but you must understand it and
pay attention (it turned out it was too easy to lose track of this particular issue, thus the birth
of stringstreams).



User-allocated storage 



With istrstreams this is the only allowed approach. There are two constructors.

The first constructor takes a pointer to a zero-terminated character array; you can extract bytes
until the zero. The second constructor additionally requires the size of the array, which
doesn't have to be zero-terminated. You can extract bytes all the way to buf[size], whether or
not you encounter a zero along the way.

When you hand an istrstream constructor the address of an array, that array must already be
filled with the characters you want to extract and presumably format into some other data
type. Here's a simple example:

//: C02:Istring.cpp
// Input strstreams
#include <iostream>
#include <strstream>
using namespace std;

int main() {
  istrstream s("47 1.414 This is a test");
  int i;
  float f;
  s >> i >> f; // Whitespace-delimited input
  char buf2[100];
  s >> buf2;
  cout << "i = " << i << ", f = " << f;
  cout << " buf2 = " << buf2 << endl;
  cout << s.rdbuf(); // Get the rest...
} ///:~

You can see that this is a more flexible and general approach to transforming character strings 
to typed values than the Standard C Library functions like atof( ), atoi( ), and so on. 

The compiler handles the static storage allocation of the string in

istrstream s("47 1.414 This is a test");

You can also hand it a pointer to a zero-terminated string allocated on the stack or the heap.



In s >> i >> f, the first number is extracted into i and the second into f. This isn't "the first
whitespace-delimited set of characters" because it depends on the data type it's being
extracted into. For example, if the string were instead, "1.414 47 This is a test," then i would
get the value one because the input routine would stop at the decimal point. Then f would get
0.414. This could be useful if you want to break a floating-point number into a whole number
and a fraction part. Otherwise it would seem to be an error.

As you may already have guessed, buf2 doesn't get the rest of the string, just the next
whitespace-delimited word. In general, it seems the best place to use the extractor in
iostreams is when you know the exact sequence of data in the input stream and you're
converting to some type other than a character string. However, if you want to extract the rest
of the string all at once and send it to another iostream, you can use rdbuf( ) as shown.


Output strstreams



Output strstreams also allow you to provide your own storage; in this case it's the place in
memory the bytes are formatted into. The appropriate constructor takes three arguments.

The first argument is the preallocated buffer where the characters will end up, the second is
the size of the buffer, and the third is the mode. If the mode is left as the default, characters
are formatted into the starting address of the buffer. If the mode is either ios::ate or ios::app
(same effect), the character buffer is assumed to already contain a zero-terminated string, and
any new characters are added starting at the zero terminator.

The second constructor argument is the size of the array and is used by the object to ensure it
doesn't overwrite the end of the array. If you fill the array up and try to add more bytes, they
won't go in.

An important thing to remember about ostrstreams is that the zero terminator you normally
need at the end of a character array is not inserted for you. When you're ready to zero-
terminate the string, use the special manipulator ends.

Once you've created an ostrstream you can insert anything you want, and it will magically
end up formatted in the memory buffer. Here's an example:

//: C02:Ostring.cpp
// Output strstreams
#include <iostream>
#include <strstream>
using namespace std;

int main() {
  const int sz = 100;
  cout << "type an int, a float and a string:";
  int i;
  float f;
  cin >> i >> f;
  cin >> ws; // Throw away white space
  char buf[sz];
  cin.getline(buf, sz); // Get rest of the line
  // (cin.rdbuf() would be awkward)
  ostrstream os(buf, sz, ios::app);
  os << endl;
  os << "integer = " << i << endl;
  os << "float = " << f << endl;
  os << ends;
  cout << buf;
  cout << os.rdbuf(); // Same effect
  cout << os.rdbuf(); // NOT the same effect
} ///:~

This is similar to the previous example in fetching the int and float. You might think the
logical way to get the rest of the line is to use rdbuf( ); this works, but it's awkward because
all the input including newlines is collected until the user presses control-Z (control-D on
Unix) to indicate the end of the input. The approach shown, using getline( ), gets the input
until the user presses the carriage return. This input is fetched into buf, which is subsequently
used to construct the ostrstream os. If the third argument ios::app weren't supplied, the
constructor would default to writing at the beginning of buf, overwriting the line that was just
collected. However, the "append" flag causes it to put the rest of the formatted information at
the end of the string.

You can see that, like the other output streams, you can use the ordinary formatting tools for 
sending bytes to the ostrstream. The only difference is that you're responsible for inserting 
the zero at the end with ends. Note that endl inserts a newline in the strstream, but no zero. 

Now the information is formatted in buf, and you can send it out directly with cout << buf.
However, it's also possible to send the information out with os.rdbuf( ). When you do this,
the get pointer inside the streambuf is moved forward as the characters are output. For this
reason, if you say cout << os.rdbuf( ) a second time, nothing happens - the get pointer is
already at the end.
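
The effect of draining a stream buffer twice can be sketched with the standard stringstream standing in for the ostrstream (an assumption made so the sketch does not depend on the older header); the helper name drain is also illustrative.

```cpp
#include <sstream>
#include <string>

// Drains `source` into a fresh string. A second call finds nothing left,
// because the get pointer is already at the end of the buffer.
std::string drain(std::stringstream& source) {
  std::ostringstream out;
  out << source.rdbuf();
  return out.str();
}
```

The first call returns everything that was formatted into the stream; the second returns an empty string.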

Automatic storage allocation 

Output strstreams (but not istrstreams) give you a second option for memory allocation: they
can do it themselves. All you do is create an ostrstream with no constructor arguments:

ostrstream a;

Now a takes care of all its own storage allocation on the heap. You can put as many bytes into
a as you want, and if it runs out of storage, it will allocate more, moving the block of memory,
if necessary.

This is a very nice solution if you don't know how much space you'll need, because it's
completely flexible. And if you simply format data into the strstream and then hand its
streambuf off to another iostream, things work perfectly:

a << "hello, world. i = " << i << endl << ends;
cout << a.rdbuf();

This is the best of all possible solutions. But what happens if you want the physical address of
the memory that a's characters have been formatted into? It's readily available - you simply
call the str( ) member function:

char* cp = a.str();

There's a problem now. What if you want to put more characters into a? It would be OK if
you knew a had already allocated enough storage for all the characters you want to give it, but
that's not true. Generally, a will run out of storage when you give it more characters, and
ordinarily it would try to allocate more storage on the heap. This would usually require
moving the block of memory. But the stream object has just handed you the address of its
memory block, so it can't very well move that block, because you're expecting it to be at a
particular location.

The way an ostrstream handles this problem is by "freezing" itself. As long as you don't use 
str( ) to ask for the internal char*, you can add as many characters as you want to the 
ostrstream. It will allocate all the necessary storage from the heap, and when the object goes 
out of scope, that heap storage is automatically released. 

However, if you call str( ), the ostrstream becomes "frozen." You can't add any more
characters to it. Rather, you aren't supposed to - implementations are not required to detect
the error. Adding characters to a frozen ostrstream results in undefined behavior. In addition,
the ostrstream is no longer responsible for cleaning up the storage. You took over that
responsibility when you asked for the char* with str( ).

To prevent a memory leak, the storage must be cleaned up somehow. There are two
approaches. The more common one is to directly release the memory when you're done. To
understand this, you need a sneak preview of two new keywords in C++: new and delete. As
you'll see in Chapter XX, these do quite a bit, but for now you can think of them as
replacements for malloc( ) and free( ) in C. The operator new returns a chunk of memory, and
delete frees it. It's important to know about them here because virtually all memory allocation
in C++ is performed with new, and this is also true with ostrstream. If it's allocated with
new, it must be released with delete, so if you have an ostrstream a and you get the char*
using str( ), the typical way to clean up the storage is

delete []a.str();

This satisfies most needs, but there's a second, much less common way to release the storage:
You can unfreeze the ostrstream. You do this by calling freeze( ), which is a member
function of the ostrstream's streambuf. freeze( ) has a default argument of one, which
freezes the stream, but an argument of zero will unfreeze it:

a.rdbuf()->freeze(0);

Now the storage is deallocated when a goes out of scope and its destructor is called. In
addition, you can add more bytes to a. However, this may cause the storage to move, so you
had better not use any pointer you previously got by calling str( ) - it won't be reliable after
adding more characters.

s the ability to add more characters after a stream has been 



The following exampl 
unfrozen: 




//: C02:Walr
// Freezing
#include <iostream>
#include <strstream>
using namespace std;

int main() {
  ostrstream s;
  s << "'The time has come', the walrus said,";
  s << ends;
  cout << s.str() << endl; // String is frozen
  // s is frozen; destructor won't delete
  // the streambuf storage on the heap
  s.seekp(-1, ios::cur); // Back up before NULL
  s.rdbuf()->freeze(0); // Unfreeze it
  // Now destructor releases memory, and
  // you can add more characters (but you
  // better not use the previous str() value)
  s << " 'To speak of many things'" << ends;
  cout << s.rdbuf();
} ///:~

After putting the first string into s, an ends is added so the string can be printed using the 
char* produced by str( ). At that point, s is frozen. We want to add more characters to s, but 
for it to have any effect, the put pointer must be backed up one so the next character is placed 
on top of the zero inserted by ends. (Otherwise the string would be printed only up to the 
original zero.) This is accomplished with seekp( ). Then s is unfrozen by fetching the 
underlying streambuf pointer using rdbuf( ) and calling freeze(0). At this point s is like it 
was before calling str( ): We can add more characters, and cleanup will occur automatically, 
with the destructor. 

It is possible to unfreeze an ostrstream and continue adding characters, but it is not common 
practice. Normally, if you want to add more characters once you've gotten the char* of an 
ostrstream, you create a new one, pour the old stream into the new one using rdbuf( ) and 
continue adding new characters to the new ostrstream. 



Proving movement 



If you're still not convinced you should be responsible for the storage of an ostrstream if you 
call str( ), here's an example that demonstrates the storage location is moved, therefore the 
old pointer returned by str( ) is invalid: 



#include <iostream>
#include <strstream>
using namespace std;

int main() {
  ostrstream s;
  s << "hi";
  char* old = s.str(); // Freezes s
  s.rdbuf()->freeze(0); // Unfreeze
  for(int i = 0; i < 100; i++)
    s << "howdy"; // Should force reallocation
  cout << "old = " << (void*)old << endl;
  cout << "new = " << (void*)s.str() << endl; // Freezes s again
  delete []s.str(); // Release storage
} ///:~

After inserting a string to s and capturing the char* with str( ), the string is unfrozen and 
enough new bytes are inserted to virtually assure the memory is reallocated and most likely 
moved. After printing out the old and new char* values, the storage is explicitly released with 
delete because the second call to str( ) froze the string again. 

To print out addresses instead of the strings they point to, you must cast the char* to a void*. 
The operator<< for char* prints out the string it is pointing to, while the operator<< for 
void* prints out the hex representation of the pointer. 

It's interesting to note that if you don't insert a string to s before calling str( ), the result is 
zero. This means no storage is allocated until the first time you try to insert bytes to the 
ostrstream. 

A better way 

Again, remember that this section was only left in to support legacy code. You should always 
use string and stringstream rather than character arrays and strstream. The former is much 
safer and easier to use and will help ensure your projects get finished faster. 

Output stream formatting 

The whole goal of this effort, and all these different types of iostreams, is to allow you to 
easily move and translate bytes from one place to another. It certainly wouldn't be very useful 
if you couldn't do all the formatting with the printf( ) family of functions. In this section, 
you'll learn all the output formatting functions that are available for iostreams, so you can get 
your bytes the way you want them. 



The formatting functions in iostreams can be somewhat confusing at first because there's 
often more than one way to control the formatting: through both member functions and 
manipulators. To further confuse things, there is a generic member function to set state flags 
to control formatting, such as left- or right-justification, whether to use uppercase letters for 
hex notation, whether to always use a decimal point for floating-point values, and so on. On 
the other hand, there are specific member functions to set and read values for the fill 
character, the field width, and the precision. 
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In an attempt to clarify all this, the internal formatting data of an iostream is examined first, 
along with the member functions that can modify that data. (Everything can be controlled 
through the member functions.) The manipulators are covered separately. 



Internal formatting data 



The class ios (which you can see in the header file <iostream>) contains data members to 
store all the formatting data pertaining to that stream. Some of this data has a range of values 
and is stored in variables: the floating-point precision, the output field width, and the 
character used to pad the output (normally a space). The rest of the formatting is determined 
by flags, which are usually combined to save space and are referred to collectively as the 
format flags. You can find out the value of the format flags with the ios::flags( ) member 
function, which takes no arguments and returns a long (typedefed to fmtflags) that contains 
the current format flags. All the rest of the functions make changes to the format flags and 
return the previous value of the format flags. 



  fmtflags ios::flags(fmtflags newflags);
  fmtflags ios::setf(fmtflags ored_flag);
  fmtflags ios::unsetf(fmtflags clear_flag);
  fmtflags ios::setf(fmtflags bits, fmtflags field);


The first function forces all the flags to change, which you do sometimes. More often, you 
change one flag at a time using the remaining three functions. 



The use of setf( ) can seem more confusing: To know which overloaded version to use, you 
must know what type of flag you're changing. There are two types of flags: ones that are 
simply on or off, and ones that work in a group with other flags. The on/off flags are the 
simplest to understand because you turn them on with setf(fmtflags) and off with 
unsetf(fmtflags). These flags are 



on/off flag       effect

ios::skipws       Skip white space. (For input; this is the default.)

ios::showbase     Indicate the numeric base (dec, oct, or hex) when printing an
                  integral value. The format used can be read by the C++ compiler.

ios::showpoint    Show decimal point and trailing zeros for floating-point values.

ios::uppercase    Display uppercase A-F for hexadecimal values and E for
                  scientific values.

ios::showpos      Show plus sign (+) for positive values.

ios::unitbuf      "Unit buffering." The stream is flushed after each insertion.

ios::stdio        Synchronizes the stream with the C standard I/O system.


For example, to show the plus sign for cout, you say cout.setf(ios::showpos). To stop 
showing the plus sign, you say cout.unsetf(ios::showpos). 

The last two flags deserve some explanation. You turn on unit buffering when you want to 
make sure each character is output as soon as it is inserted into an output stream. You could 
also use unbuffered output, but unit buffering provides better performance. 

The ios::stdio flag is used when you have a program that uses both iostreams and the C 
standard I/O library (not unlikely if you're using C libraries). If you discover your iostream 
output and printf( ) output are occurring in the wrong order, try setting this flag. 

Format fields 

The second type of formatting flag works in a group. You can have only one of these flags on 
at a time, like the buttons on old car radios - you push one in, the rest pop out. Unfortunately 
this doesn't happen automatically, and you have to pay attention to what flags you're setting 
so you don't accidentally call the wrong setf( ) function. For example, there's a flag for each 
of the number bases: hexadecimal, decimal, and octal. Collectively, these flags are referred to 
as the ios::basefield. If the ios::dec flag is set and you call setf(ios::hex), you'll set the 
ios::hex flag, but you won't clear the ios::dec bit, resulting in undefined behavior. The proper 
thing to do is call the second form of setf( ) like this: setf(ios::hex, ios::basefield). This 
function first clears all the bits in the ios::basefield, then sets ios::hex. Thus, this form of 
setf( ) ensures that the other flags in the group "pop out" whenever you set one. Of course, the 
hex( ) manipulator does all this for you, automatically, so you don't have to concern yourself 
with the internal details of the implementation of this class or to even care that it's a set of 
binary flags. Later you'll see there are manipulators to provide equivalent functionality in all 
the places you would use setf( ). 



Here are the flag groups and their effects: 

ios::basefield    effect

ios::dec          Format integral values in base 10 (decimal) (default radix).

ios::hex          Format integral values in base 16 (hexadecimal).

ios::oct          Format integral values in base 8 (octal).


ios::floatfield   effect

ios::scientific   Display floating-point numbers in scientific format. Precision
                  field indicates number of digits after the decimal point.

ios::fixed        Display floating-point numbers in fixed format. Precision
                  field indicates number of digits after the decimal point.

"automatic"       Precision field indicates the total number of significant
(Neither bit      digits.
is set.)


ios::adjustfield  effect

ios::left         Left-align values; pad on the right with the fill character.

ios::right        Right-align values. Pad on the left with the fill character.
                  This is the default alignment.

ios::internal     Add fill characters after any leading sign or base indicator,
                  but before the value.
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Width, fill and precision 



The internal variables that control the width of the output field, the fill character used when 
the data doesn't fill the output field, and the precision for printing floating-point numbers are 
read and written by member functions of the same name. 



function                    effect

int ios::width( )           Reads the current width. (Default is 0.) Used for both
                            insertion and extraction.

int ios::width(int n)       Sets the width, returns the previous width.

int ios::fill( )            Reads the current fill character. (Default is space.)

int ios::fill(int n)        Sets the fill character, returns the previous fill
                            character.

int ios::precision( )       Reads current floating-point precision. (Default is 6.)

int ios::precision(int n)   Sets floating-point precision, returns previous
                            precision. See ios::floatfield table for the meaning
                            of "precision."



The fill and precision values are fairly straightforward, but width requires some explanation. 
When the width is zero, inserting a value will produce the minimum number of characters 
necessary to represent that value. A positive width means that inserting a value will produce 
at least as many characters as the width; if the value has fewer than width characters, the fill 
character is used to pad the field. However, the value will never be truncated, so if you try to 
print 123 with a width of two, you'll still get 123. The field width specifies a minimum 
number of characters; there's no way to specify a maximum number. 

The width is also distinctly different because it's reset to zero by each inserter or extractor 
that could be influenced by its value. It's really not a state variable, but an implicit argument 
to the inserters and extractors. If you want to have a constant width, you have to call width( ) 
after each insertion or extraction. 
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An exhaustive example 



To make sure you know how to call all the functions previously discussed, here's an example 
that calls them all: 

//: C02:Format.cpp
// Formatting functions
#include <fstream>
using namespace std;
#define D(A) T << #A << endl; A
ofstream T("format.out");

int main() {
  D(int i = 47;)
  D(float f = 2300114.414159;)
  const char* s = "Is there any more?";

  D(T.setf(ios::unitbuf);)
//  D(T.setf(ios::stdio);) // SOMETHING MAY HAVE CHANGED

  D(T.setf(ios::showbase);)
  D(T.setf(ios::uppercase);)
  D(T.setf(ios::showpos);)
  D(T << i << endl;) // Default to dec
  D(T.setf(ios::hex, ios::basefield);)
  D(T << i << endl;)
  D(T.unsetf(ios::uppercase);)
  D(T.setf(ios::oct, ios::basefield);)
  D(T << i << endl;)
  D(T.unsetf(ios::showbase);)
  D(T.setf(ios::dec, ios::basefield);)
  D(T.setf(ios::left, ios::adjustfield);)
  D(T.fill('0');)
  D(T << "fill char: " << T.fill() << endl;)
  D(T.width(10);)
  T << i << endl;
  D(T.setf(ios::right, ios::adjustfield);)
  D(T.width(10);)
  T << i << endl;
  D(T.setf(ios::internal, ios::adjustfield);)
  D(T.width(10);)
  T << i << endl;
  D(T << i << endl;) // Without width(10)

  D(T.unsetf(ios::showpos);)
  D(T.setf(ios::showpoint);)
  D(T << "prec = " << T.precision() << endl;)
  D(T.setf(ios::scientific, ios::floatfield);)
  D(T << endl << f << endl;)
  D(T.setf(ios::fixed, ios::floatfield);)
  D(T << f << endl;)
  D(T.setf(0, ios::floatfield);) // Automatic
  D(T << f << endl;)
  D(T.precision(20);)
  D(T << "prec = " << T.precision() << endl;)
  D(T << endl << f << endl;)
  D(T.setf(ios::scientific, ios::floatfield);)
  D(T << endl << f << endl;)
  D(T.setf(ios::fixed, ios::floatfield);)
  D(T << f << endl;)
  D(T.setf(0, ios::floatfield);) // Automatic
  D(T << f << endl;)

  D(T.width(10);)
  T << s << endl;
  D(T.width(40);)
  T << s << endl;
  D(T.setf(ios::left, ios::adjustfield);)
  D(T.width(40);)
  T << s << endl;

  D(T.unsetf(ios::showpoint);)
  D(T.unsetf(ios::unitbuf);)
//  D(T.unsetf(ios::stdio);) // SOMETHING MAY HAVE CHANGED
} ///:~

This example uses a trick to create a trace file so you can monitor what's happening. The 
macro D(A) uses the preprocessor "stringizing" to turn A into a string to print out. Then it 
reiterates A so the statement takes effect. The macro sends all the information out to an 
ofstream called T, which writes the trace file format.out. The output is 



int i = 47;
float f = 2300114.414159;
T.setf(ios::unitbuf);
T.setf(ios::stdio);
T.setf(ios::showbase);
T.setf(ios::uppercase);
T.setf(ios::showpos);
T << i << endl;
+47
T.setf(ios::hex, ios::basefield);
T << i << endl;
+0X2F
T.unsetf(ios::uppercase);
T.setf(ios::oct, ios::basefield);
T << i << endl;
+057
T.unsetf(ios::showbase);
T.setf(ios::dec, ios::basefield);
T.setf(ios::left, ios::adjustfield);
T.fill('0');
T << "fill char: " << T.fill() << endl;
fill char: 0
T.width(10);
+470000000
T.setf(ios::right, ios::adjustfield);
T.width(10);
0000000+47
T.setf(ios::internal, ios::adjustfield);
T.width(10);
+000000047
T << i << endl;
+47
T.unsetf(ios::showpos);
T.setf(ios::showpoint);
T << "prec = " << T.precision() << endl;
prec = 6
T.setf(ios::scientific, ios::floatfield);
T << endl << f << endl;

2.300115e+06
T.setf(ios::fixed, ios::floatfield);
T << f << endl;
2300114.500000
T.setf(0, ios::floatfield);
T << f << endl;
2.300115e+06
T.precision(20);
T << "prec = " << T.precision() << endl;
prec = 20
T << endl << f << endl;

2300114.50000000020000000000
T.setf(ios::scientific, ios::floatfield);
T << endl << f << endl;

2.30011450000000020000e+06
T.setf(ios::fixed, ios::floatfield);
T << f << endl;
2300114.50000000020000000000
T.setf(0, ios::floatfield);
T << f << endl;
2300114.50000000020000000000
T.width(10);
Is there any more?
T.width(40);
0000000000000000000000Is there any more?
T.setf(ios::left, ios::adjustfield);
T.width(40);
Is there any more?0000000000000000000000
T.unsetf(ios::showpoint);
T.unsetf(ios::unitbuf);
T.unsetf(ios::stdio);



Studying this output should clarify your understanding of the iostream formatting member 
functions. 

Formatting manipulators 

As you can see from the previous example, calling the member functions can get a bit tedious. 
To make things easier to read and write, a set of manipulators is supplied to duplicate the 
actions provided by the member functions. 

Manipulators with no arguments are provided in <iostream>. These include dec, oct, and 
hex, which perform the same action as, respectively, setf(ios::dec, ios::basefield), 
setf(ios::oct, ios::basefield), and setf(ios::hex, ios::basefield), albeit more succinctly. 
<iostream> also includes ws, endl, ends, and flush and the additional set shown here: 


These only appear in the revised library; you won't find them in older implementations. 

manipulator       effect

showbase          Indicate the numeric base (dec, oct, or hex) when printing an
noshowbase        integral value. The format used can be read by the C++
                  compiler.

showpos           Show plus sign (+) for positive values.
noshowpos

uppercase         Display uppercase A-F for hexadecimal values, and E for
nouppercase       scientific values.

showpoint         Show decimal point and trailing zeros for floating-point
noshowpoint       values.

skipws            Skip white space on input.
noskipws

left              Left-align, pad on right.
right             Right-align, pad on left.
internal          Fill between leading sign or base indicator and value.

scientific        Use scientific or fixed notation. setprecision( ) or
fixed             ios::precision( ) sets number of places after the decimal
                  point.


Manipulators with arguments 

If you are using manipulators with arguments, you must also include the header file 
<iomanip>. This contains code to solve the general problem of creating manipulators with 
arguments. In addition, it has six predefined manipulators: 

manipulator                 effect

setiosflags(fmtflags n)     Sets only the format flags specified by n. Setting
                            remains in effect until the next change, like
                            ios::setf( ).

resetiosflags(fmtflags n)   Clears only the format flags specified by n. Setting
                            remains in effect until the next change, like
                            ios::unsetf( ).

setbase(base n)             Changes base to n, where n is 10, 8, or 16. (Anything
                            else results in 0.) If n is zero, output is base 10,
                            but input uses the C conventions: 10 is 10, 010 is 8,
                            and 0xf is 15. You might as well use dec, oct, and
                            hex for output.

setfill(char n)             Changes the fill character to n, like ios::fill( ).

setprecision(int n)         Changes the precision to n, like ios::precision( ).

setw(int n)                 Changes the field width to n, like ios::width( ).



If you're using a lot of inserters, you can see how this can clean things up. As an example, 
here's the previous program rewritten to use the manipulators. (The macro has been removed 
to make it easier to read.) 

//: C02:Manips.cpp
// Format.cpp using manipulators
#include <fstream>
#include <iomanip>
using namespace std;

int main() {
  ofstream trc("trace.out");
  int i = 47;
  float f = 2300114.414159;
  const char* s = "Is there any more?";

  trc << setiosflags(
         ios::unitbuf | ios::showbase
         | ios::uppercase | ios::showpos);
  trc << i << endl; // Default to dec
  trc << hex << i << endl;
  trc << resetiosflags(ios::uppercase)
    << oct << i << endl;
  trc.setf(ios::left, ios::adjustfield);
  trc << resetiosflags(ios::showbase)
    << dec << setfill('0');
  trc << "fill char: " << trc.fill() << endl;
  trc << setw(10) << i << endl;
  trc.setf(ios::right, ios::adjustfield);
  trc << setw(10) << i << endl;
  trc.setf(ios::internal, ios::adjustfield);
  trc << setw(10) << i << endl;
  trc << i << endl; // Without setw(10)

  trc << resetiosflags(ios::showpos)
    << setiosflags(ios::showpoint)
    << "prec = " << trc.precision() << endl;
  trc.setf(ios::scientific, ios::floatfield);
  trc << f << endl;
  trc.setf(ios::fixed, ios::floatfield);
  trc << f << endl;
  trc.setf(0, ios::floatfield); // Automatic
  trc << f << endl;
  trc << setprecision(20);
  trc << "prec = " << trc.precision() << endl;
  trc << f << endl;
  trc.setf(ios::scientific, ios::floatfield);
  trc << f << endl;
  trc.setf(ios::fixed, ios::floatfield);
  trc << f << endl;
  trc.setf(0, ios::floatfield); // Automatic
  trc << f << endl;

  trc << setw(10) << s << endl;
  trc << setw(40) << s << endl;
  trc.setf(ios::left, ios::adjustfield);
  trc << setw(40) << s << endl;

  trc << resetiosflags(
         ios::showpoint | ios::unitbuf);
} ///:~

You can see that a lot of the multiple statements have been condensed into a single chained 
insertion. Note the calls to setiosflags( ) and resetiosflags( ), where the flags have been 
bitwise-ORed together. This could also have been done with setf( ) and unsetf( ) in the 
previous example. 



Creating manipulators 



Sometimes you'd like to create your own manipulators, and it turns out to be remarkably 
simple. A zero-argument manipulator like endl is simply a function that takes as its argument 
an ostream reference (references are a different way to pass arguments, discussed in Chapter 
XX). The declaration for endl is 

  ostream& endl(ostream&);

Now, when you say: 

  cout << "howdy" << endl;

the endl produces the address of that function. So the compiler says "is there a function I 
can call that takes the address of a function as its argument?" There is a pre-defined function in 
iostream.h to do this; it's called an applicator. The applicator calls the function, passing it 
the ostream object as an argument. 

You don't need to know how the applicator works to create your own manipulator; you only 
need to know the applicator exists. Here's an example that creates a manipulator called nl that 
emits a newline without flushing the stream: 

//: C02:nl.cpp
// Creating a manipulator
#include <iostream>
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} ///:- 
The expression 

  os << '\n';

calls a function that returns os, which is what is returned from nl. 

People often argue that the nl approach shown above is preferable to using endl because the 
latter always flushes the output stream, which may incur a performance penalty. 

Effectors 

As you've seen, zero-argument manipulators are quite easy to create. But what if you want to 
create a manipulator that takes arguments? The iostream library has a rather convoluted and 
confusing way to do this, but Jerry Schwarz, the creator of the iostream library, suggests a 
scheme he calls effectors. An effector is a simple class whose constructor performs the desired 
operation, along with an overloaded operator<< that works with the class. Here's an example 
with two effectors. The first outputs a truncated character string, and the second prints a 
number in binary (the process of defining an overloaded operator<< will not be discussed 
until Chapter XX): 

//: C02:Effector.txt
// (Should be "cpp" but I can't get it to compile with
// My windows compilers, so making it a txt file will
// keep it out of the makefile for the time being)
// Jerry Schwarz's "effectors"
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <string>
#include <climits> // ULONG_MAX
using namespace std;

// Put out a portion of a string:
class Fixw {
  string str;
public:
  Fixw(const string& s, int width)
    : str(s, 0, width) {}
  // const reference so a temporary Fixw can be passed:
  friend ostream&
  operator<<(ostream& os, const Fixw& fw) {
    return os << fw.str;
  }
};

typedef unsigned long ulong;

// Print a number in binary:
class Bin {
  ulong n;
public:
  Bin(ulong nn) { n = nn; }
  friend ostream& operator<<(ostream&, const Bin&);
};

ostream& operator<<(ostream& os, const Bin& b) {
  ulong bit = ~(ULONG_MAX >> 1); // Top bit set
  while(bit) {
    os << (b.n & bit ? '1' : '0');
    bit >>= 1;
  }
  return os;
}

int main() {
  const char* string =
    "Things that make us happy, make us wise";
  for(int i = 1; i <= strlen(string); i++)
    cout << Fixw(string, i) << endl;
  ulong x = 0xCAFEBABEUL;
  ulong y = 0x76543210UL;
  cout << "x in binary: " << Bin(x) << endl;
  cout << "y in binary: " << Bin(y) << endl;
} ///:~

Before putting nl into a header file, you should make it an inline function (see Chapter 7). 
In a private conversation. 

The constructor for Fixw creates a shortened copy of its argument in the string member str, 
and that copy is released automatically when the Fixw object is destroyed. The overloaded 
operator<< takes the contents of its second argument, the Fixw object, and inserts it into the 
first argument, the ostream, then returns the ostream so it can be used in a chained 
expression. When you use Fixw in an expression like this: 

  cout << Fixw(string, i) << endl;

a temporary object is created by the call to the Fixw constructor, and that temporary is passed 
to operator<<. The effect is that of a manipulator with arguments. 

The Bin effector relies on the fact that shifting an unsigned number to the right shifts zeros 
into the high bits. ULONG_MAX (the largest unsigned long value, from the standard include 
file <climits> ) is used to produce a value with the high bit set, and this value is moved across 
the number in question (by shifting it), masking each bit. 

Initially the problem with this technique was that once you created a class called Fixw for 
char* or Bin for unsigned long, no one else could create a different Fixw or Bin class for 
their type. However, with namespaces (covered in Chapter XX), this problem is eliminated. 



Iostream examples 



In this section you'll see some examples of what you can do with all the information you've 
learned in this chapter. Although many tools exist to manipulate bytes (stream editors like sed 
and awk from Unix are perhaps the most well known, but a text editor also fits this category), 
they generally have some limitations. sed and awk can be slow and can only handle lines in a 
forward sequence, and text editors usually require human interaction, or at least learning a 
proprietary macro language. The programs you write with iostreams have none of these 
limitations: They're fast, portable, and flexible. It's a very useful tool to have in your kit. 



Code generation 



The first examples concern the generation of programs that, coincidentally, fit the format 
used in this book. This provides a little extra speed and consistency when developing code. The 
first program creates a file to hold main( ) (assuming it takes no command-line arguments and 
uses the iostream library): 

//: C02:Makemain.cpp
// Create a shell main() file
#include "../require.h"
#include <fstream>
#include <strstream>
#include <cstring>
#include <cctype>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ofstream mainfile(argv[1]);
  assure(mainfile, argv[1]);
  istrstream name(argv[1]);
  ostrstream CAPname;
  char c;
  while(name.get(c))
    CAPname << char(toupper(c));
  CAPname << ends;
  mainfile << "//" << ": " << CAPname.rdbuf()
    << " -- " << endl
    << "#include <iostream>" << endl
    << endl
    << "main() {" << endl << endl
    << "}" << endl;
} ///:~

The argument on the command line is used to create an istrstream, so the characters can be 
extracted one at a time and converted to upper case with the Standard C library macro 
toupper( ). This returns an int so it must be explicitly cast to a char. This name is used in the 
headline, followed by the remainder of the generated file. 

Maintaining class library source 

The second example performs a more complex and useful task. Generally, when you create a 
class you think in library terms, and make a header file Name.h for the class declaration and a 
file where the member functions are implemented, called Name.cpp. These files have certain 
requirements: a particular coding standard (the program shown here will use the coding 
format for this book), and in the header file the declarations are generally surrounded by some 
preprocessor statements to prevent multiple declarations of classes. (Multiple declarations 
confuse the compiler - it doesn't know which one you want to use. They could be different, 
so it throws up its hands and gives an error message.) 

This example allows you to create a new header-implementation pair of files, or to modify an 
existing pair. If the files already exist, it checks and potentially modifies the files, but if they 
don't exist, it creates them using the proper format. 

[[ This should be changed to use string instead of <cstring> ]] 

//: C02:Cppcheck.cpp
// Configures .h & .cpp files
// To conform to style standard.
// Tests existing files for conformance
#include "../require.h"
#include <fstream>
#include <strstream>
#include <cstring>
#include <cctype>
using namespace std;

int main(int argc, char* argv[]) {
  const int sz = 40; // Buffer sizes
  const int bsz = 100;
  requireArgs(argc, 1); // File set name
  enum bufs { base, header, implement,
    Hline1, guard1, guard2, guard3,
    CPPline1, include, bufnum };
  char b[bufnum][sz];
  ostrstream osarray[] = {
    ostrstream(b[base], sz),
    ostrstream(b[header], sz),
    ostrstream(b[implement], sz),
    ostrstream(b[Hline1], sz),
    ostrstream(b[guard1], sz),
    ostrstream(b[guard2], sz),
    ostrstream(b[guard3], sz),
    ostrstream(b[CPPline1], sz),
    ostrstream(b[include], sz),
  };
  osarray[base] << argv[1] << ends;
  // Find any '.' in the string using the
  // Standard C library function strchr():
  char* period = strchr(b[base], '.');
  if(period) *period = 0; // Strip extension
  // Force to upper case:
  for(int i = 0; b[base][i]; i++)
    b[base][i] = toupper(b[base][i]);
  // Create file names and internal lines:
  osarray[header] << b[base] << ".h" << ends;
  osarray[implement] << b[base] << ".cpp" << ends;
  osarray[Hline1] << "//" << ": " << b[header]
    << " -- " << ends;
  osarray[guard1] << "#ifndef " << b[base]
    << "_H" << ends;
  osarray[guard2] << "#define " << b[base]
    << "_H" << ends;
  osarray[guard3] << "#endif // " << b[base]
    << "_H" << ends;
  osarray[CPPline1] << "//" << ": "
    << b[implement] << " -- " << ends;
  osarray[include] << "#include \""
    << b[header] << "\"" << ends;
  // First, try to open existing files:
  ifstream existh(b[header]),
           existcpp(b[implement]);
  if(!existh) { // Doesn't exist; create it
    ofstream newheader(b[header]);
    assure(newheader, b[header]);
    newheader << b[Hline1] << endl
      << b[guard1] << endl
      << b[guard2] << endl << endl
      << b[guard3] << endl;
  }
  if(!existcpp) { // Create cpp file
    ofstream newcpp(b[implement]);
    assure(newcpp, b[implement]);
    newcpp << b[CPPline1] << endl
      << b[include] << endl;
  }
  if(existh) { // Already exists; verify it
    strstream hfile; // Write & read
    ostrstream newheader; // Write
    hfile << existh.rdbuf() << ends;
    // Check that first line conforms:
    char buf[bsz];
    if(hfile.getline(buf, bsz)) {
      if(!strstr(buf, b[header]))
        newheader << b[Hline1] << endl;
    }
    // Ensure guard lines are in header:
    if(!strstr(hfile.str(), b[guard1]) ||
       !strstr(hfile.str(), b[guard2]) ||
       !strstr(hfile.str(), b[guard3])) {
      newheader << b[guard1] << endl
        << b[guard2] << endl
        << buf
        << hfile.rdbuf() << endl
        << b[guard3] << endl << ends;
    } else
      newheader << buf
        << hfile.rdbuf() << ends;
    // If there were changes, overwrite file:
    if(strcmp(hfile.str(), newheader.str()) != 0) {
      existh.close();
      ofstream newH(b[header]);
      assure(newH, b[header]);
      newH << "//@//" << endl // Change marker
        << newheader.rdbuf();
    }
    delete []hfile.str();
    delete []newheader.str();
  }
  if(existcpp) { // Already exists; verify it
    strstream cppfile;
    ostrstream newcpp;
    cppfile << existcpp.rdbuf() << ends;
    char buf[bsz];
    // Check that first line conforms:
    if(cppfile.getline(buf, bsz))
      if(!strstr(buf, b[implement]))
        newcpp << b[CPPline1] << endl;
    // Ensure header is included:
    if(!strstr(cppfile.str(), b[include]))
      newcpp << b[include] << endl;
    // Put in the rest of the file:
    newcpp << buf << endl; // First line read
    newcpp << cppfile.rdbuf() << ends;
    // If there were changes, overwrite file:
    if(strcmp(cppfile.str(), newcpp.str()) != 0) {
      existcpp.close();
      ofstream newCPP(b[implement]);
      assure(newCPP, b[implement]);
      newCPP << "//@//" << endl // Change marker
        << newcpp.rdbuf();
    }
    delete []cppfile.str();
    delete []newcpp.str();
  }
} ///:~

This example requires a lot of string formatting in many different buffers. Rather than
creating a lot of individually named buffers and ostrstream objects, a single set of names is
created in the enum bufs. Then two arrays are created: an array of character buffers and an
array of ostrstream objects built from those character buffers. Note that in the definition for
the two-dimensional array of char buffers b, the number of char arrays is determined by
bufnum, the last enumerator in bufs. When you create an enumeration, the compiler assigns
integral values to all the enum labels starting at zero, so the sole purpose of bufnum is to be a
counter for the number of enumerators in bufs. The length of each string in b is sz.

The names in the enumeration are base, the capitalized base file name without extension;
header, the header file name; implement, the implementation file (cpp) name; Hline1, the
skeleton first line of the header file; guard1, guard2, and guard3, the "guard" lines in the
header file (to prevent multiple inclusion); CPPline1, the skeleton first line of the cpp file;
and include, the line in the cpp file that includes the header file.

osarray is an array of ostrstream objects created using aggregate initialization and automatic
counting. Of course, this is the form of the ostrstream constructor that takes two arguments
(the buffer address and buffer size), so the constructor calls must be formed accordingly
inside the aggregate initializer list. Using the bufs enumerators, the appropriate array element
of b is tied to the corresponding osarray object. Once the array is created, the objects in the
array can be selected using the enumerators, and the effect is to fill the corresponding b
element. You can see how each string is built in the lines following the ostrstream array
definition.

Once the strings have been created, the program attempts to open existing versions of both the
header and cpp file as ifstreams. If a file doesn't exist, testing its stream object with the
operator '!' yields true. If the header or implementation file doesn't exist, it is created
using the appropriate lines of text built earlier.

If the files do exist, then they are verified to ensure the proper format is followed. In both
cases, a strstream is created and the whole file is read in; then the first line is read and
checked to make sure it follows the format by seeing if it contains both a "//:" and the name of
the file. This is accomplished with the Standard C library function strstr( ). If the first line
doesn't conform, the one created earlier is inserted into an ostrstream that has been created to
hold the edited file.
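
The first-line test described above can be sketched in isolation. This is only an illustration: the helper name firstLineConforms is hypothetical, and it uses std::string::find where the program itself uses strstr( ) on char buffers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the program above): returns true when
// the first line carries both the "//:" tag and the file name, mirroring
// the two substring tests described in the text.
bool firstLineConforms(const std::string& firstLine,
                       const std::string& fileName) {
  return firstLine.find("//:") != std::string::npos &&
         firstLine.find(fileName) != std::string::npos;
}
```

A line such as "//: C02:Stack.h" conforms for the file name "Stack.h"; a line missing either the tag or the name does not.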

In the header file, the whole file is searched (again using strstr( )) to ensure it contains the 
three "guard" lines; if not, they are inserted. The implementation file is checked for the 
existence of the line that includes the header file (although the compiler effectively guarantees 
its existence). 

In both cases, the original file (in its strstream) and the edited file (in the ostrstream) are
compared to see if there are any changes. If there are, the existing file is closed, and a new 
ofstream object is created to overwrite it. The ostrstream is output to the file after a special 
change marker is added at the beginning, so you can use a text search program to rapidly find 
any files that need reviewing to make additional changes. 

Detecting compiler errors 

All the code in this book is designed to compile as shown without errors. Any line of code
that should generate a compile-time error is commented out with the special comment
sequence "//!". The following program will remove these special comments and append a
numbered comment to the line, so that when you run your compiler it should generate error
messages and you should see all the numbers appear when you compile all the files. It also
appends the modified line to a special file so you can easily locate any lines that don't
generate errors:
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// : C02 : Showerr . cpp 
// Un-comment error generators
#include "../require.h"
#include <iostream>
#include <fstream>
#include <strstream>
#include <cctype>
#include <cstring>
using namespace std;
const char* marker = "//!";

const char* usage =
"usage: showerr filename chapnum\n"
"where filename is a C++ source file\n"
"and chapnum is the chapter name it's in.\n"
"Finds lines commented with //! and removes\n"
"comment, appending //(#) where # is unique\n"
"across all files, so you can determine\n"
"if your compiler finds the error.\n"
"showerr /r\n"
"resets the unique counter.";

// File containing error number counter:
const char* errnum = "../errnum.txt";
// File containing error lines:

ofstream errlines(errfile, ios::app);

int main(int argc, char* argv[]) {
  requireArgs(argc, 2, usage);
  if(argv[1][0] == '/' || argv[1][0] == '-') {
    // Allow for other switches:
    switch(argv[1][1]) {
      case 'r': case 'R':
        remove(errnum); // Delete files
        remove(errfile);
        return 0;
      default:
        cerr << usage << endl;
        return 1;
    }
  }
  char* chapter = argv[2];
  strstream edited; // Edited file
  int counter = 0;
  {
    ifstream infile(argv[1]);
    assure(infile, argv[1]);
    ifstream count(errnum);
    assure(count, errnum);
    if(count) count >> counter;
    int linecount = 0;
    const int sz = 255;
    char buf[sz];
    while(infile.getline(buf, sz)) {
      linecount++;
      // Eat white space:
      int i = 0;
      while(isspace(buf[i]))
        i++;
      // Find marker at start of line:
      if(strstr(&buf[i], marker) == &buf[i]) {
        // Erase marker:
        memset(&buf[i], ' ', strlen(marker));
        // Append counter & error info:
        ostrstream out(buf, sz, ios::ate);
        out << "//(" << ++counter << ") "
            << "Chapter " << chapter
            << " File: " << argv[1]
            << " Line " << linecount << endl
            << ends;
        edited << buf;
        errlines << buf; // Append error file
      } else
        edited << buf << "\n"; // Just copy
    }
  } // Closes files
  ofstream outfile(argv[1]); // Overwrites
  assure(outfile, argv[1]);
  outfile << edited.rdbuf();
  ofstream count(errnum); // Overwrites
  assure(count, errnum);
  count << counter; // Save new counter
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The marker can be replaced with one of your choice. 

Each file is read a line at a time, and each line is searched for the marker appearing at the head 
of the line; the line is modified and put into the error line list and into the strstream edited.
When the whole file is processed, it is closed (by reaching the end of a scope), reopened as an 
output file and edited is poured into the file. Also notice the counter is saved in an external 
file, so the next time this program is invoked it continues to sequence the counter. 
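
The per-line transformation can be sketched as a small function on std::string. This is a simplified illustration, not the program's own code: the name exposeError is hypothetical, and it appends only the numbered tag, leaving out the chapter, file and line information.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: if `line` starts (after leading white space) with
// the marker "//!", blank the marker out and append a numbered comment.
// Otherwise return the line unchanged. `counter` is advanced only when a
// marker is found.
std::string exposeError(const std::string& line, int& counter) {
  const std::string marker = "//!";
  std::string::size_type i = 0;
  while (i < line.size() &&
         std::isspace(static_cast<unsigned char>(line[i])))
    ++i;
  if (line.compare(i, marker.size(), marker) != 0)
    return line;                          // No marker at start of line
  std::string result = line;
  result.replace(i, marker.size(), marker.size(), ' ');  // Erase marker
  std::ostringstream tag;
  tag << "//(" << ++counter << ")";
  return result + tag.str();
}
```

Because the counter is passed by reference, the caller decides where it comes from; in the program above it is loaded from and saved to an external file.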



A simple datalogger 



The example is meant to produce a temperature-depth profile at
various points. To hold the data, a class is used:

//: C02:DataLogger.h
// Datalogger record layout
#ifndef DATALOG_H
#define DATALOG_H
#include <ctime>
#include <iostream>

class DataPoint {
  std::tm time; // Time & day
  static const int bsz = 10;
  // Ascii degrees (*) minutes (') seconds ("):
  char latitude[bsz], longitude[bsz];
  double depth, temperature;
public:
  std::tm getTime();
  void setTime(std::tm t);
  const char* getLatitude();
  void setLatitude(const char* l);
  const char* getLongitude();
  void setLongitude(const char* l);
  double getDepth();
  void setDepth(double d);
  double getTemperature();
  void setTemperature(double t);
  void print(std::ostream& os);
};
#endif // DATALOG_H ///:~

The access functions provide controlled reading and writing to each of the data members. The
print( ) function formats the DataPoint in a readable form onto an ostream object (the
argument to print( )). Here's the definition file:






//: C02:Datalog.cpp {O}
// Datapoint member functions
#include "DataLogger.h"
#include <iomanip>
#include <cstring>
using namespace std;

tm DataPoint::getTime() { return time; }

void DataPoint::setTime(tm t) { time = t; }

const char* DataPoint::getLatitude() {
  return latitude;
}

void DataPoint::setLatitude(const char* l) {
  latitude[bsz - 1] = 0;
  strncpy(latitude, l, bsz - 1);
}

const char* DataPoint::getLongitude() {
  return longitude;
}

void DataPoint::setLongitude(const char* l) {
  longitude[bsz - 1] = 0;
  strncpy(longitude, l, bsz - 1);
}

double DataPoint::getDepth() { return depth; }

void DataPoint::setDepth(double d) { depth = d; }

double DataPoint::getTemperature() {
  return temperature;
}

void DataPoint::setTemperature(double t) {
  temperature = t;
}

void DataPoint::print(ostream& os) {
  os.setf(ios::fixed, ios::floatfield);
  os.precision(4);
  os.fill('0'); // Pad on left with '0'
  os << setw(2) << getTime().tm_mon << '\\'
     << setw(2) << getTime().tm_mday << '\\'
     << setw(2) << getTime().tm_year << ' '
     << setw(2) << getTime().tm_hour << ':'
     << setw(2) << getTime().tm_min << ':'
     << setw(2) << getTime().tm_sec;
  os.fill(' '); // Pad on left with ' '
  os << " Lat:" << setw(9) << getLatitude()
     << ", Long:" << setw(9) << getLongitude()
     << ", depth:" << setw(9) << getDepth()
     << ", temp:" << setw(9) << getTemperature()
     << endl;
} ///:~

In print( ), the call to setf( ) causes the floating-point output to be fixed-precision, and
precision( ) sets the number of decimal places to four.

The default is to right-justify the data within the field. The time information consists of two
digits each for the hours, minutes and seconds, so the width is set to two with setw( ) in each 
case. (Remember that any changes to the field width affect only the next output operation, so 
setw( ) must be given for each output.) But first, to put a zero in the left position if the value is 
less than 10, the fill character is set to '0'. Afterwards, it is set back to a space. 
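
The interplay of setw( ) and the fill character can be tried out on its own. The following is a minimal sketch using an ostringstream; the function name formatClock and the trailing "end" field are illustrative assumptions, not part of DataPoint.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a time as zero-padded HH:MM:SS, then
// switches the fill character back to a space for a following field.
std::string formatClock(int hour, int min, int sec) {
  std::ostringstream os;
  os.fill('0');                       // Pad on left with '0'
  os << std::setw(2) << hour << ':'   // setw() affects only the next item,
     << std::setw(2) << min << ':'    // so it is repeated for each number
     << std::setw(2) << sec;
  os.fill(' ');                       // Back to padding with spaces
  os << std::setw(6) << "end";        // Right-justified in a field of 6
  return os.str();
}
```

The fill character stays in effect until it is changed, whereas the width is reset after every output operation, which is why fill( ) is called twice and setw( ) four times.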

The latitude and longitude are zero-terminated character fields, which hold the information as
degrees (here, '*' denotes degrees), minutes ('), and seconds ("). You can certainly devise a
more efficient storage layout for latitude and longitude if you desire.



Generating test data 

Here's a program that creates a file of test data in binary form (using write( )) and a second
file in ASCII form using DataPoint::print( ). You can also print it out to the screen but it's
easier to inspect in file form.

//: C02:Datagen.cpp
//{L} Datalog
// Test data generator
#include "DataLogger.h"
#include "../require.h"
#include <fstream>
#include <cstdlib>
#include <cstring>
using namespace std;

int main() {
  ofstream data("data.txt");
  assure(data, "data.txt");
  ofstream bindata("data.bin", ios::binary);
  assure(bindata, "data.bin");
  time_t timer;
  // Seed random number generator:
  srand(time(&timer));
  for(int i = 0; i < 100; i++) {
    DataPoint d;
    // Convert date/time to a structure:
    d.setTime(*localtime(&timer));
    timer += 55; // Reading each 55 seconds
    d.setLatitude("45*20'31\"");
    d.setLongitude("22*34'18\"");
    // Zero to 199 meters:
    double newdepth = rand() % 200;
    double fraction = rand() % 100 + 1;
    newdepth += double(1) / fraction;
    d.setDepth(newdepth);
    double newtemp = 150 + rand() % 200; // Kelvin
    fraction = rand() % 100 + 1;
    newtemp += (double)1 / fraction;
    d.setTemperature(newtemp);
    d.print(data);
    bindata.write((unsigned char*)&d,
                  sizeof(d));
  }
} ///:~

The file DATA.TXT is created in the ordinary way as an ASCII file, but DATA.BIN has the
flag ios::binary to tell the constructor to set it up as a binary file.

The Standard C library function time( ), when called with a zero argument, returns the current
time as a time_t value, which is the number of seconds elapsed since 00:00:00 GMT, January
1, 1970 (the dawning of the age of Aquarius?). The current time is the most convenient way to
seed the random number generator with the Standard C library function srand( ), as is done
here.

Sometimes a more convenient way to store the time is as a tm structure, which has all the 
elements of the time and date broken up into their constituent parts as follows: 

int tm_sec; // 0-59 seconds
int tm_min; // 0-59 minutes
int tm_hour; // 0-23 hours
int tm_mday; // Day of month
int tm_mon; // 0-11 months
int tm_year; // Years since 1900
int tm_wday; // Sunday == 0, etc.
int tm_yday; // 0-365 day of year
int tm_isdst; // Daylight savings?



To convert from the time in seconds to the local time in the tm format, you use the Standard
C library localtime( ) function, which takes the address of the number of seconds and returns
a pointer to the resulting tm. This tm, however, is a static structure inside the localtime( )
function, which is rewritten every time localtime( ) is called. To copy the contents into the
tm struct inside DataPoint, you might think you must copy each element individually. However,
all you must do is a structure assignment, and the compiler will take care of the rest. This
means the right-hand side must be a structure, not a pointer, so the result of localtime( ) is
dereferenced. The desired result is achieved with

d.setTime(*localtime(&timer));

After this, the timer is incremented by 55 seconds to give an interesting interval between
readings.
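
The effect of copying the structure rather than the pointer can be seen in a small sketch. To keep the result independent of the local time zone it uses gmtime( ) instead of localtime( ); the function name snapshotUtc is hypothetical, and the sketch assumes that a time_t of zero denotes January 1, 1970, as described above.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical helper: copy the broken-down time out of the library's
// shared static buffer so that later calls cannot overwrite it.
std::tm snapshotUtc(std::time_t seconds) {
  return *std::gmtime(&seconds);  // Structure copy, not pointer copy
}
```

Two snapshots taken 55 seconds apart remain distinct objects, whereas two saved pointers would both refer to the same static structure and show only the latest value.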

The latitude and longitude used are fixed values to indicate a set of readings at a single 
location. Both the depth and the temperature are generated with the Standard C library rand( ) 
function, which returns a pseudorandom number between zero and the constant 
RAND_MAX. To put this in a desired range, use the modulus operator % and the upper end 
of the range. These numbers are integral; to add a fractional part, a second call to rand( ) is 
made, and the value is inverted after adding one (to prevent divide-by-zero errors). 
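
The arithmetic can be checked with fixed inputs in place of rand( ). In this sketch the function name scaledReading is an assumption, and rawWhole and rawFrac stand in for two successive rand( ) results.

```cpp
#include <cassert>

// Hypothetical helper: rawWhole and rawFrac stand in for two successive
// rand() results; range is the upper end of the desired integral range.
double scaledReading(int rawWhole, int rawFrac, int range) {
  double value = rawWhole % range;     // Integral part: 0 .. range-1
  double divisor = rawFrac % 100 + 1;  // 1 .. 100, never zero
  return value + 1.0 / divisor;        // Add the inverted fraction
}
```

Note that the fractional part is at least 0.01 and at most 1, so the largest possible result is exactly range rather than just under it.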

In effect, the DATA.BIN file is being used as a container for the data in the program, even
though the container exists on disk and not in RAM. To send the data out to the disk in binary
form, write( ) is used. The first argument is the starting address of the source block; notice it
must be cast to an unsigned char* because that's what the function expects. The second
argument is the number of bytes to write, which is the size of the DataPoint object. Because
no pointers are contained in DataPoint, there is no problem in writing the object to disk. If
the object is more sophisticated, you must implement a scheme for serialization. (Most
vendor class libraries have some sort of serialization structure built into them.)
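
A round trip through a binary stream can be sketched without touching the disk. Two assumptions are worth stating: the struct Sample is a hypothetical stand-in for DataPoint, and the casts are to const char* and char*, which is what the Standard C++ signatures of write( ) and read( ) take (older libraries, like the one used for the listing above, may expect unsigned char*).

```cpp
#include <cassert>
#include <cstring>
#include <sstream>

// Hypothetical stand-in for DataPoint: plain data, no pointers.
struct Sample {
  char label[10];
  double depth;
  double temperature;
};

// Copy a Sample through a binary stream and back, byte for byte.
Sample roundTrip(const Sample& in) {
  std::stringstream disk(std::ios::in | std::ios::out | std::ios::binary);
  disk.write(reinterpret_cast<const char*>(&in), sizeof in);
  Sample out;
  disk.read(reinterpret_cast<char*>(&out), sizeof out);
  return out;
}
```

This works only because Sample holds its data directly. A member that was a pointer would be copied as an address, which is meaningless once the original object is gone.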



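Verifying & viewing the data



To check the validity of the data stored in binary format, it is read from the disk and put in
text form in DATA2.TXT, so that file can be compared to DATA.TXT for verification. In the
following program, you can see how simple this data recovery is. After the test file is
created, the records are read at the command of the user.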



//: C02:Da 
//(L( Data 
// Verify , 






#include "DataLogger.h"
#include "../require.h"
#include <iostream>
#include <fstream>
#include <strstream>
#include <iomanip>
using namespace std;

int main() {
  ifstream bindata("data.bin", ios::binary);
  assure(bindata, "data.bin");
  // Create comparison file to verify data.txt:
  ofstream verify("data2.txt");
  assure(verify, "data2.txt");
  DataPoint d;
  while(bindata.read(
    (unsigned char*)&d, sizeof d))
    d.print(verify);
  bindata.clear(); // Reset state to "good"
  // Display user-selected records:
  int recnum = 0;
  // Left-align everything:
  cout.setf(ios::left, ios::adjustfield);
  // Fixed precision of 4 decimal places:
  cout.setf(ios::fixed, ios::floatfield);
  cout.precision(4);
  for(;;) {
    bindata.seekg(recnum * sizeof d, ios::beg);
    cout << "record " << recnum << endl;
    if(bindata.read(
      (unsigned char*)&d, sizeof d)) {
      // asctime() needs the address of a named object:
      tm t = d.getTime();
      cout << asctime(&t);
      cout << setw(11) << "Latitude"
           << setw(11) << "Longitude"
           << setw(10) << "Depth"
           << setw(12) << "Temperature"
           << endl;
      // Put a line after the description:
      cout << setfill('-') << setw(43) << '-'
           << setfill(' ') << endl;
      cout << setw(11) << d.getLatitude()
           << setw(11) << d.getLongitude()
           << setw(10) << d.getDepth()
           << setw(12) << d.getTemperature()
           << endl;
    } else {
      cout << "invalid record number" << endl;
      bindata.clear(); // Reset state to "good"
    }
    cout << endl
      << "enter record number, x to quit:";
    char buf[10];
    cin.getline(buf, 10);
    if(buf[0] == 'x') break;
    istrstream input(buf, 10);
    input >> recnum;
  }
} ///:~

The ifstream bindata is created from DATA.BIN as a binary file. Because opening an ifstream
never creates a missing file, assure( ) fails if the file doesn't exist. The read( ) statement
reads a single record and places it directly into the DataPoint d. (Again, if DataPoint
contained pointers this would result in meaningless pointer values.) This read( ) action will
set bindata's failbit when the end of the file is reached, which will cause the while statement
to fail. At this point, however, you can't move the get pointer back and read more records
because the state of the stream won't allow further reads. So the clear( ) function is called
to reset the failbit.
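
The need for clear( ) can be demonstrated with an in-memory stream. This is a minimal sketch under the assumption that an istringstream behaves like the file stream in this respect; the function name firstWordAfterRewind is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: read every whitespace-separated word, then rewind
// and read the first word again.
std::string firstWordAfterRewind(const std::string& text) {
  std::istringstream in(text);
  std::string word;
  while (in >> word)
    ;                          // Runs until the read fails at end of input
  in.clear();                  // Reset the failbit set by the last read
  in.seekg(0, std::ios::beg);  // Now the get pointer can be moved back
  in >> word;
  return word;
}
```

If the call to clear( ) is removed, the final extraction does nothing and the function returns the last word it read instead of the first.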

Once the record is read in from disk, you can do anything you want with it, such as perform 
calculations or make graphs. Here, it is displayed to further exercise your knowledge of 
iostream formatting. 

The rest of the program displays a record number (represented by recnum) selected by the 
user. As before, the precision is fixed at four decimal places, but this time everything is left 
justified. 

The formatting of this output looks different from before: 

record 0
Tue Nov 16 18:15:49 1993
Latitude   Longitude  Depth     Temperature
-------------------------------------------
45*20'31"  22*34'18"  186.0172  269.0167

To make sure the labels and the data columns line up, the labels are put in the same width 
fields as the columns, using setw( ). The line in between is generated by setting the fill 
character to '-', the width to the desired line width, and outputting a single '-'. 

If the read( ) fails, you'll end up in the else part, which tells the user the record number was
invalid. Then, because the failbit was set, it must be reset with a call to clear( ) so the next 
read( ) is successful (assuming it's in the right range). 
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Of course, you can also open the binary data file for writing as well as reading. This way you 
can retrieve the records, modify them, and write them back to the same location, thus creating 
a flat-file database management system. In my very first programming job, I also had to create 
a flat-file DBMS - but using BASIC on an Apple II. It took months, while this took minutes. 
Of course, it might make more sense to use a packaged DBMS now, but with C++ and 
iostreams you can still do all the low-level operations that are necessary in a lab. 



Counting editor 



Sometimes an editing task requires numbering something sequentially, but all the other text is
duplicated. I encountered this problem when pasting digital photos into a Web page: I got the
formatting just right, then duplicated it, then had the problem of incrementing the photo
number for each one. So I replaced the photo number with XXX, duplicated that, and wrote the
following program to find and replace the "XXX" with an incremented count. Notice the
formatting, so the value will be "001," "002," etc.:

//: C02:NumberPhotos.cpp
// Find the marker "XXX" and replace it with an
// incrementing number wherever it appears. Used
// to help format a web page with photos in it
#include "../require.h"
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 2);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  ofstream out(argv[2]);
  assure(out, argv[2]);
  string line;
  int counter = 1;
  while(getline(in, line)) {
    string::size_type xxx = line.find("XXX");
    if(xxx != string::npos) {
      ostringstream cntr;
      cntr << setfill('0') << setw(3) << counter++;
      line.replace(xxx, 3, cntr.str());
    }
    out << line << endl;
  }
} ///:~
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Breaking up big files 



This program was created to break up big files into smaller ones, in particular so they could
be more easily downloaded from an Internet server (since hangups sometimes occur, this
allows someone to download a file a piece at a time and then re-assemble it at the client end).
You'll note that the program also creates a reassembly batch file for DOS (where it is
necessary), whereas under Linux/Unix you simply say something like "cat *piece* >
destination.file".

This program reads the entire file into memory, which of course relies on having a 32-bit
operating system with virtual memory for big files. It then pieces it out in chunks to the 
smaller files, generating the names as it goes. Of course, you can come up with a possibly 
more reasonable strategy that reads a chunk, creates a file, reads another chunk, etc. 
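
One way to sketch the chunk-at-a-time strategy is shown below. It is an illustration under stated assumptions: the function name readInPieces is hypothetical, and each element of the returned vector stands for the contents of one output piece, where a real program would write the chunk to a file and discard it.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read `in` one chunk at a time. Only one chunk-sized
// buffer is needed, however large the input is.
std::vector<std::string> readInPieces(std::istream& in,
                                      std::size_t pieceSize) {
  assert(pieceSize > 0);
  std::vector<std::string> pieces;
  std::vector<char> buf(pieceSize);
  // read() fails on a short final chunk, but gcount() still reports how
  // many bytes arrived, so the partial chunk is kept as well.
  while (in.read(&buf[0], static_cast<std::streamsize>(buf.size())) ||
         in.gcount() > 0) {
    pieces.push_back(
      std::string(&buf[0], static_cast<std::size_t>(in.gcount())));
  }
  return pieces;
}
```

With a piece size of 3, the input "abcdefg" yields the pieces "abc", "def" and "g".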

Note that this program can be run on the server, so you only have to download the big file 
once and then break it up once it's on the server. 

//: C02:Breakup.cpp
// Breaks a file up into smaller files for
// easier downloads
#include "../require.h"
#include <iostream>
#include <fstream>
#include <iomanip>
#include <strstream>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1], ios::binary);
  assure(in, argv[1]);
  in.seekg(0, ios::end); // End of file
  long fileSize = in.tellg(); // Size of file
  cout << "file size = " << fileSize << endl;
  in.seekg(0, ios::beg); // Start of file
  char* fbuf = new char[fileSize];
  require(fbuf != 0);
  in.read(fbuf, fileSize);
  in.close();
  string infile(argv[1]);
  string::size_type dot = infile.find('.');
  while(dot != string::npos) {
    infile.replace(dot, 1, "-");
    dot = infile.find('.');
  }
  string batchName(
    "DOSAssemble" + infile + ".bat");
  ofstream batchFile(batchName.c_str());
  batchFile << "copy /b ";
  int filecount = 0;
  const int sbufsz = 128;
  char sbuf[sbufsz];
  const long pieceSize = 1000L * 100L;
  long byteCounter = 0;
  while(byteCounter < fileSize) {
    ostrstream name(sbuf, sbufsz);
    name << argv[1] << "-part" << setfill('0')
         << setw(2) << filecount++ << ends;
    cout << "creating " << sbuf << endl;
    if(filecount > 1)
      batchFile << "+";
    batchFile << sbuf;
    ofstream out(sbuf, ios::out | ios::binary);
    long byteq;
    if(byteCounter + pieceSize < fileSize)
      byteq = pieceSize;
    else
      byteq = fileSize - byteCounter;
    out.write(fbuf + byteCounter, byteq);
    cout << "wrote " << byteq << " bytes, ";
    byteCounter += byteq;
    out.close();
    cout << "ByteCounter = " << byteCounter
         << ", fileSize = " << fileSize << endl;
  }
  batchFile << " " << argv[1] << endl;
} ///:~
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Summary 



This chapter has given you a fairly thorough introduction to the iostream class library. In all 
likelihood, it is all you need to create programs using iostreams. (In later chapters you'll see 
simple examples of adding iostream functionality to your own classes.) However, you should 
be aware that there are some additional features in iostreams that are not used often, but which 
you can discover by looking at the iostream header files and by reading your compiler's
documentation on iostreams.



Exercises 



1. Open a file by creating an ifstream object called in. Make an ostrstream
object called os, and read the entire contents into the ostrstream using the
rdbuf( ) member function. Get the address of os's char* with the str( )
function, and capitalize every character in the file using the Standard C
toupper( ) macro. Write the result out to a new file, and delete the memory
allocated by os.

2. Create a program that opens a file (the first argument on the command line)
and searches it for any one of a set of words (the remaining arguments on
the command line). Print the lines (with line numbers) that match.

3. Write a program that adds a copyright notice to the beginning of all source-
code files. This is a small modification to exercise 1.

4. Use your favorite text-searching program (grep, for example) to output the
names (only) of all the files that contain a particular pattern. Redirect the
output into a file. Write a program that uses the contents of that file to
generate a batch file that invokes your editor on each of the files found by
the search program.
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3: Templates in 
depth 

Nontype template arguments 

Here is a random number generator class that always produces a unique number and 
overloads operator( ) to produce a familiar function-call syntax: 



//: C03:Urand.h
// Unique random number generator
#ifndef URAND_H
#define URAND_H
#include <algorithm>
#include <cstdlib>
#include <cstring>
#include <ctime>

template<int upperBound>
class Urand {
  int used[upperBound];
  bool recycle;
public:
  Urand(bool recycle = false);
  int operator()(); // The "generator" function
};

template<int upperBound>
Urand<upperBound>::Urand(bool recyc)
  : recycle(recyc) {
  std::memset(used, 0, upperBound * sizeof(int));
  std::srand(std::time(0)); // Seed random number generator
}

template<int upperBound>
int Urand<upperBound>::operator()() {
  // Look for an unused slot (compare ints, not bytes):
  if(std::find(used, used + upperBound, 0)
     == used + upperBound) {
    if(!recycle)
      return -1; // No more spaces left
    std::memset(used, 0, upperBound * sizeof(int));
  }
  int newval;
  while(used[newval = std::rand() % upperBound])
    ; // Until unique value is found
  used[newval]++; // Set flag
  return newval;
}
#endif // URAND_H ///:~

The uniqueness of Urand is produced by keeping a map of all the numbers possible in the 
random space (the upper bound is set with the template argument) and marking each one off 
as it's used. The optional constructor argument allows you to reuse the numbers once they're 
all used up. Notice that this implementation is optimized for speed by allocating the entire 
map, regardless of how many numbers you're going to need. If you want to optimize for size, 
you can change the underlying implementation so it allocates storage for the map dynamically 
and puts the random numbers themselves in the map rather than flags. Notice that this change 
in implementation will not affect any client code. 
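
A size-conscious variant along these lines might look like the following sketch. It rests on assumptions: the class name UrandSet is hypothetical, it stores the numbers already handed out in a std::set, and it takes the upper bound as a constructor argument rather than as a template argument, so unlike the change described above it does alter how the object is created. Recycling is left out for brevity.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>

// Hypothetical variant: storage grows only as numbers are drawn, and the
// container holds the numbers themselves rather than one flag per slot.
class UrandSet {
  int upperBound;
  std::set<int> used;
public:
  explicit UrandSet(int bound) : upperBound(bound) {}
  // Returns an unused value in [0, upperBound), or -1 when none are left.
  int operator()() {
    if (static_cast<int>(used.size()) >= upperBound)
      return -1;
    int candidate;
    do {
      candidate = std::rand() % upperBound;
    } while (!used.insert(candidate).second);  // Retry until it is new
    return candidate;
  }
};
```

The trade-off is time for space: memory is proportional to the numbers drawn so far, but each draw may need several retries as the range fills up.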

Default template arguments 
The typename keyword 



//: C03:TypenamedID.cpp
// Using 'typename' to say it's a type,
// and not something other than a type

template<class T> class X {
  // Without typename, you should get an error:
  typename T::id i;
public:
  void f() { i.g(); }
};

class Y {
public:
  class id {
  public:
    void g() {}
  };
};

int main() {
  Y y;
  X<Y> xy;
  xy.f();
} ///:~

The template definition assumes that the class T that you hand it must have a nested identifier 
of some kind called id. But id could be a member object of T, in which case you can perform 
operations on id directly, but you couldn't "create an object" of "the type id." However, that's 
exactly what is happening here: the identifier id is being treated as if it were actually a nested 
type inside T. In the case of class Y, id is in fact a nested type, but (without the typename
keyword) the compiler can't know that when it's compiling X. 

If, when it sees an identifier in a template, the compiler has the option of treating that 
identifier as a type or as something other than a type, then it will assume that the identifier 
refers to something other than a type. That is, it will assume that the identifier refers to an 
object (including variables of primitive types), an enumeration or something similar. 
However, it will not, and cannot, just assume that it is a type. Thus, the compiler gets confused
when we pretend it's a type. 

The typename keyword tells the compiler to interpret a particular name as a type. It must be 
used for a name that: 

1. Is a qualified name, one that is nested within another type.

2. Depends on a template argument. That is, a template argument is somehow involved in 
the name. The template argument causes the ambiguity when the compiler makes the 
simplest assumption: that the name refers to something other than a type. 

Because the default behavior of the compiler is to assume that a name that fits the above two 
points is not a type, you must use typename even in places where you think that the compiler 
ought to be able to figure out the right way to interpret the name on its own. In the above 
example, when the compiler sees T::id, it knows (because of the typename keyword) that id
refers to a nested type and thus it can create an object of that type. 

The short version of the rule is: if your type is a qualified name that involves a template 
argument, you must use typename. 
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Typedefing a typename 

The typename keyword does not automatically create a typedef. A line which reads:

typename Seq::iterator It;

causes a variable to be declared of type Seq::iterator. If you mean to make a typedef, you
must say:

typedef typename Seq::iterator It;
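
Both uses can be seen together in a short function template. This is a minimal sketch: the name sumOf is hypothetical, and it assumes a standard-style container that provides the nested types value_type and const_iterator.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: sum the elements of any standard-style container.
// Every qualified name that depends on Container needs typename.
template<class Container>
typename Container::value_type sumOf(const Container& c) {
  // typedef typename: creates a type alias, not a variable
  typedef typename Container::value_type Value;
  Value total = Value();
  // typename alone: declares the variable `it` of the nested type
  for (typename Container::const_iterator it = c.begin();
       it != c.end(); ++it)
    total += *it;
  return total;
}
```

Removing any of the three typename keywords should produce a compile-time error, because the compiler would then assume the qualified name is not a type.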

Using typename instead of class 

With the introduction of the typename keyword, you now have the option of using typename
instead of class in the template argument list of a template definition. This may produce code
which is clearer:

//: C03:UsingTypename.cpp
template<typename T> class X { };



You'll probably see a great deal of code which does not use typename in this fashion, since 
the keyword was added to the language a relatively long time after templates were introduced. 



Function templates 



A class template describes an infinite set of classes, and the most common place you'll see
templates is with classes. However, C++ also supports the concept of an infinite set of
functions, which is sometimes useful. The syntax is virtually identical, except that you create
a function instead of a class. The classic example of a function template is a sorting
function. However, a
function template is useful in all sorts of places, as demonstrated in the first example that

follows. The second example shows a function template used with c 



See C++ Inside & Out (Osborne/McGraw-Hill, 1993) by the author, Chapter 10.
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A string conversion system 



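//: C03:stringConv.h
// Chuck Allison's string converter
#ifndef STRINGCONV_H
#define STRINGCONV_H
#include <string>
#include <sstream>

template<typename T>
T fromString(const std::string& s) {
  std::istringstream is(s);
  T t;
  is >> t;
  return t;
}

template<typename T>
std::string toString(const T& t) {
  std::ostringstream s;
  s << t;
  return s.str();
}
#endif // STRINGCONV_H ///:~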
Here's a test program, that includes the use of the Standard Library complex number type: 



//: C03:stringConvTest.cpp
#include "stringConv.h"
#include <iostream>
#include <complex>
using namespace std;

int main() {
  int i = 1234;
  cout << "i == \"" << toString(i) << "\"\n";
  float x = 567.89;
  cout << "x == \"" << toString(x) << "\"\n";
  complex<float> c(1.0, 2.0);
  cout << "c == \"" << toString(c) << "\"\n";
  cout << endl;

  i = fromString<int>(string("1234"));
  cout << "i == " << i << endl;
  x = fromString<float>(string("567.89"));
  cout << "x == " << x << endl;
  c = fromString< complex<float> >(string("(1.0,2.0)"));
  cout << "c == " << c << endl;
} ///:~
The output is what you'd expect:

i == "1234"
x == "567.89"
c == "(1,2)"

i == 1234
x == 567.89
c == (1,2)



A memory allocation system 

There are a few things you can do to make the raw memory allocation routines malloc( ),
calloc( ) and realloc( ) safer. The following function template produces one function
getmem( ) that either allocates a new piece of memory or resizes an existing piece (like
realloc( )). In addition, it zeroes only the new memory, and it checks to see that the memory
is successfully allocated. Also, you only tell it the number of elements of the type you want,
not the number of bytes, so the possibility of a programmer error is reduced. Here's the
header file:

//: C03:Getmem.h
// Function template for memory
#ifndef GETMEM_H
#define GETMEM_H
#include "../require.h"
#include <cstdlib>
#include <cstring>

template<class T>
void getmem(T*& oldmem, int elems) {
  typedef int cntr; // Type of element counter
  const int csz = sizeof(cntr); // And size
  const int tsz = sizeof(T);
  if(elems == 0) {
    free(&(((cntr*)oldmem)[-1]));
    return;
  }
  T* p = oldmem;
  cntr oldcount = 0;
  if(p) { // Previously allocated memory
    // Old style:
    // ((cntr*)p)--; // Back up by one cntr
    // New style:
    cntr* tmp = reinterpret_cast<cntr*>(p);
    p = reinterpret_cast<T*>(--tmp);
    oldcount = *(cntr*)p; // Previous # elems
  }
  T* m = (T*)realloc(p, elems * tsz + csz);
  require(m != 0);
  *((cntr*)m) = elems; // Keep track of count
  const cntr increment = elems - oldcount;
  if(increment > 0) {
    // Starting address of data:
    long startadr = (long)&(m[oldcount]);
    startadr += csz;
    // Zero the additional new memory:
    memset((void*)startadr, 0, increment * tsz);
  }
  // Return the address beyond the count:
  oldmem = (T*)&(((cntr*)m)[1]);
}
#endif // GETMEM_H ///:~

To be able to zero only the new memory, a counter indicating the number of elements
allocated is attached to the beginning of each block of memory. The typedef cntr is the type
of this counter; it allows you to change from int to long if you need to handle larger chunks
(other issues come up when using long, however - these are seen in compiler warnings).

A pointer reference is used for the argument oldmem because the outside variable (a pointer)
must be changed to point to the new block of memory. oldmem must point to zero (to allocate
new memory) or to an existing block of memory that was created with getmem( ). This
function assumes you're using it properly, but for debugging you could add an additional tag
next to the counter containing an identifier, and check that identifier in getmem( ) to help
discover incorrect calls.
If the number of elements requested is zero, the storage is freed. There's an additional
function template freemem( ) that aliases this behavior.

You'll notice that getmem( ) is very low-level - there are lots of casts and byte
manipulations. For example, the oldmem pointer doesn't point to the true beginning of the
memory block, but just past the beginning to allow for the counter. So to free( ) the memory
block, getmem( ) must back up the pointer by the amount of space occupied by cntr. Because
oldmem is a T*, it must first be cast to a cntr*, then indexed backwards one place. Finally
the address of that location is produced for free( ) in the expression:

free(&(((cntr*)oldmem)[-1]));

Similarly, if this is previously allocated memory, getmem( ) must back up by one cntr size to
get the true starting address of the memory, and then extract the previous number of elements.
The true starting address is required inside realloc( ). If the storage size is being increased,
then the difference between the new number of elements and the old number is used to
calculate the starting address and the amount of memory to zero in memset( ). Finally, the
address beyond the count is produced to assign to oldmem in the statement:

oldmem = (T*)&(((cntr*)m)[1]);

Again, because oldmem is a reference to a pointer, this has the effect of changing the outside
argument passed to getmem( ).
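The block layout described here (a count stored immediately in front of the data the caller sees) can be sketched without integer casts by doing the arithmetic on typed pointers. This is only an illustration under stated assumptions: the names resizeCounted( ), countOf( ) and releaseCounted( ) are hypothetical, the counter is a std::size_t rather than the cntr typedef, T is assumed to need no stricter alignment than std::size_t, and T must be a type for which all-zero bytes are a valid value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Layout of each block: [count][elem 0][elem 1]...
// The caller only ever sees the address of elem 0.

// Allocates (old == 0) or resizes a counted block and zeroes only the
// newly added elements. Returns 0 on failure, leaving old untouched.
template<class T>
T* resizeCounted(T* old, std::size_t elems) {
  std::size_t* base =
    old ? reinterpret_cast<std::size_t*>(old) - 1 : 0;
  std::size_t oldCount = base ? *base : 0;
  void* raw = std::realloc(base,
    sizeof(std::size_t) + elems * sizeof(T));
  if(!raw) return 0;
  std::size_t* newBase = static_cast<std::size_t*>(raw);
  *newBase = elems;                    // Keep track of the count
  T* data = reinterpret_cast<T*>(newBase + 1);
  if(elems > oldCount)                 // Zero only the new tail
    std::memset(data + oldCount, 0, (elems - oldCount) * sizeof(T));
  return data;
}

// Reads the count stored just before the data.
template<class T>
std::size_t countOf(const T* p) {
  assert(p != 0);
  return reinterpret_cast<const std::size_t*>(p)[-1];
}

// Backs up to the true start of the block before freeing it.
template<class T>
void releaseCounted(T* p) {
  if(p) std::free(reinterpret_cast<std::size_t*>(p) - 1);
}
```

Unlike getmem( ), this sketch returns the new address instead of modifying a pointer reference, so a failed allocation does not lose the old block.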

Here's a program to test getmem( ). It allocates storage and fills it up with values, then 
increases that amount of storage: 

//: C03:Getmem.cpp
// Test memory function template
#include "Getmem.h"
#include <iostream>
using namespace std;

int main() {
  int* p = 0;
  getmem(p, 10);
  for(int i = 0; i < 10; i++) {
    cout << p[i] << ' ';
    p[i] = i;
  }
  cout << '\n';
  getmem(p, 20);
  for(int j = 0; j < 20; j++) {
    cout << p[j] << ' ';
    p[j] = j;
  }
  cout << '\n';
  getmem(p, 25);
  for(int k = 0; k < 25; k++)
    cout << p[k] << ' ';
  freemem(p);
  cout << '\n';

  float* f = 0;
  getmem(f, 3);
  for(int u = 0; u < 3; u++) {
    cout << f[u] << ' ';
    f[u] = u + 3.14159;
  }
  cout << '\n';
  getmem(f, 6);
  for(int v = 0; v < 6; v++)
    cout << f[v] << ' ';
  freemem(f);
} ///:~

After each getmem( ), the values in memory are printed out to show that the new ones have
been zeroed.

Notice that a different version of getmem( ) is instantiated for the int and float pointers. You
might think that because all the manipulations are so low-level you could get away with a
single non-template function and pass a void*& as oldmem. This doesn't work because then
the compiler must do a conversion from your type to a void*. To take the reference, it makes
a temporary. This produces an error because then you're modifying the temporary pointer, not
the pointer you want to change. So the function template is necessary to produce the exact
type for the argument.
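The difference between the two signatures can be checked at compile time. In this sketch the functions resetVoid( ), resetTyped( ) and resetDemo( ) are hypothetical stand-ins for the real allocator, and static_assert and <type_traits> are C++11 features that postdate this text.

```cpp
#include <cassert>
#include <type_traits>

// Non-template attempt: takes a reference to void*.
inline void resetVoid(void*& p) { p = 0; }

// Template version: T*& binds directly to the caller's own pointer.
template<class T>
void resetTyped(T*& p) { p = 0; }

// An int* lvalue cannot bind to void*&: the conversion would produce
// a temporary void*, and a non-const reference cannot bind to it.
static_assert(!std::is_convertible<int*&, void*&>::value,
              "int* lvalue must not bind to void*&");
// With the template, T is deduced as int and the types match exactly.
static_assert(std::is_convertible<int*&, int*&>::value,
              "int* lvalue binds to int*&");

inline void resetDemo() {
  int x = 7;
  int* ip = &x;
  // resetVoid(ip);  // Error: cannot bind int* to void*&
  resetTyped(ip);    // OK: modifies ip itself
  assert(ip == 0);
}
```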

Type induction in function 
templates 

As a simple but very useful example, consider the following:

//: :arraySize.h
// Uses template type induction to
// discover the size of an array
#ifndef ARRAYSIZE_H
#define ARRAYSIZE_H

#endif // ARRAYSIZE_H ///:~

This actually figures out the size of an array as a compile-time constant value, without using
any sizeof( ) operations! Thus you can have a much more succinct way to calculate the size of
an array at compile time:

//: C03:ArraySize.cpp
// The return value of the template function
// asz() is a compile-time constant
#include "../arraySize.h"

int main() {
  int a[12], b[20];
  const int sz1 = asz(a);
  const int sz2 = asz(b);
  int c[sz1], d[sz2];
} ///:~

Of course, just making a variable of a built-in type a const does not guarantee it's actually a
compile-time constant, but if it's used to define the size of an array (as it is in the last line of
main( )), then it must be a compile-time constant.
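The definition of asz( ) is missing from the header listing above. A sketch of how such a function can be written follows; the signature is my assumption, reconstructed from the way asz( ) is called. One caveat: in Standard C++ the result of an ordinary function call is not a constant expression, so a compiler that accepts the array definitions with a plain function may be relying on a variable-length array extension. The constexpr specifier (C++11, later than this text) makes the guarantee explicit, and that is what the sketch uses.

```cpp
#include <cassert>
#include <cstddef>

// Assumed shape of asz(): both the element type T and the extent N
// are deduced from the array reference; no sizeof is involved.
template<typename T, std::size_t N>
constexpr std::size_t asz(T (&)[N]) { return N; }

// Hypothetical demonstration function.
inline std::size_t arraySizeDemo() {
  int a[12], b[20];
  constexpr std::size_t sz1 = asz(a);
  constexpr std::size_t sz2 = asz(b);
  int c[sz1], d[sz2];  // Legal: sz1 and sz2 are constant expressions
  static_assert(sizeof(c) == 12 * sizeof(int), "c has 12 elements");
  static_assert(sizeof(d) == 20 * sizeof(int), "d has 20 elements");
  return asz(c) + asz(d);
}
```

Passing a pointer instead of an array fails to compile, because a pointer cannot bind to a reference-to-array parameter; a sizeof-based macro would silently give the wrong answer in that case.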

Taking the address of a 

generated function template 



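There are situations where you need to take the address of a function, for example to pass
it to a function that takes a pointer to another function. That other function might be
generated from a function template, so you need some way to take that kind of address: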

//: C03:TemplateFunctionAddress.cpp
// Taking the address of a function generated
// from a template.

template <typename T> void f(T*) {}

void h(void (*pf)(int*)) {}

template <class T>
void g(void (*pf)(T*)) {}

int main() {
  // Full type exposition:
  h(&f<int>);
  // Type induction:
  h(&f);
  // Full type exposition:
  g<int>(&f<int>);
  // Type inductions:
  g(&f<int>);
  g<int>(&f);
} ///:~

I am indebted to Nathan Myers for this example.









This example demonstrates a number of different issues. First, even though you're using
templates, the signatures must match - the function h( ) takes a pointer to a function that takes
an int* and returns void, and that's what the template f produces. Second, the function that
wants the function pointer as an argument can itself be a template, as in the case of the
template g.

In main( ) you can see that type induction works here, too. The first call to h( ) explicitly
gives the template argument for f, but since h( ) says that it will only take the address of a
function that takes an int*, that part can be induced by the compiler. With g( ) the situation is
even more interesting because there are two templates involved. The compiler cannot induce
the type with nothing to go on, but if either f or g is given int, then the rest can be induced.
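To make the selected instantiation observable, the following sketch uses a variant of f that returns sizeof(T) rather than void. That variation and the addressDemo( ) helper are hypothetical, not part of the example above. Note that what this text calls type induction is usually called template argument deduction.

```cpp
#include <cassert>
#include <cstddef>

// Variant of f that reports which T it was instantiated for.
template<typename T>
std::size_t f(T*) { return sizeof(T); }

// Only accepts a function taking int*.
inline std::size_t h(std::size_t (*pf)(int*)) { return pf(0); }

// Accepts a function taking T* for any T.
template<typename T>
std::size_t g(std::size_t (*pf)(T*)) { return pf(0); }

inline void addressDemo() {
  assert(h(&f<int>) == sizeof(int));  // Template argument spelled out
  assert(h(&f) == sizeof(int));       // T chosen from h's parameter type
  assert(g<char>(&f<char>) == 1);     // Both spelled out
  assert(g(&f<char>) == 1);           // g's T comes from f<char>
  assert(g<char>(&f) == 1);           // f's T comes from g<char>
  // g(&f);  // Error: nothing fixes T for either template
}
```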

Local classes in templates 

Applying a function to an STL 
sequence 

Suppose you want to take an STL sequence container (which you'll learn more about in
subsequent chapters; for now we can just use the familiar vector) and apply a function to all
the objects it contains. Because a vector can contain any type of object, you need a function
that works with any type of vector and any type of object it contains:

//: C03:applySequence.h
// Apply a function to an STL sequence container

// 0 arguments, any type of return value:
template<class Seq, class T, class R>
void apply(Seq& sq, R (T::*f)()) {
  typename Seq::iterator it = sq.begin();
  while(it != sq.end()) {
    ((*it)->*f)();
    it++;
  }
}

// 1 argument, any type of return value:
template<class Seq, class T, class R, class A>
void apply(Seq& sq, R(T::*f)(A), A a) {
  typename Seq::iterator it = sq.begin();
  while(it != sq.end()) {
    ((*it)->*f)(a);
    it++;
  }
}

// 2 arguments, any type of return value:
template<class Seq, class T, class R,
         class A1, class A2>
void apply(Seq& sq, R(T::*f)(A1, A2),
    A1 a1, A2 a2) {
  typename Seq::iterator it = sq.begin();
  while(it != sq.end()) {
    ((*it)->*f)(a1, a2);
    it++;
  }
}
// Etc., to handle maximum likely arguments ///:~

The apply( ) function template takes a reference to the container class and a pointer-to-
member for a member function of the objects contained in the class. It uses an iterator to
move through the sequence and apply the function to every object. If you've (understandably)
forgotten the pointer-to-member syntax, you can refresh your memory at the end of Chapter
XX.

Notice that there are no STL header files (or any header files, for that matter) included in 
applySequence.h, so it is actually not limited to use with an STL sequence. However, it does 
make assumptions (primarily, the name and behavior of the iterator) that apply to STL 
sequences. 
You can see there is more than one version of apply( ), so it's possible to overload function
templates. Although they all take any type of return value (which is ignored, but the type
information is required to match the pointer-to-member), each version takes a different
number of arguments, and because it's a template, those arguments can be of any type. The
only limitation here is that there's no "super template" to create templates for you; thus you
must decide how many arguments will ever be required.
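Later versions of the language lifted this limitation: variadic templates, added in C++11 and so not available when this was written, let one template accept any number of arguments. A minimal sketch follows; the name applyAll( ) and the Counter test class are hypothetical.

```cpp
#include <cassert>
#include <vector>

// One template for any argument count. Params is deduced from the
// pointer-to-member, Args from the values actually passed, so the
// usual conversions (e.g. int to double) still apply at the call.
template<class Seq, class T, class R, class... Params, class... Args>
void applyAll(Seq& sq, R (T::*f)(Params...), Args... args) {
  for(typename Seq::iterator it = sq.begin(); it != sq.end(); ++it)
    ((*it)->*f)(args...);
}

// Hypothetical class with member functions of 0, 1 and 2 arguments.
class Counter {
  int total;
public:
  Counter() : total(0) {}
  void bump() { ++total; }
  int add(int n) { total += n; return total; }
  void addBoth(int a, double b) { total += a + static_cast<int>(b); }
  int value() const { return total; }
};

inline void applyAllDemo() {
  Counter a, b;
  std::vector<Counter*> v;
  v.push_back(&a);
  v.push_back(&b);
  applyAll(v, &Counter::bump);             // total becomes 1
  applyAll(v, &Counter::add, 10);          // total becomes 11
  applyAll(v, &Counter::addBoth, 1, 2.0);  // total becomes 14
  assert(a.value() == 14 && b.value() == 14);
}
```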

To test the various overloaded versions of apply( ), the class Gromit is created containing
functions with different numbers of arguments:

//: C03:Gromit.h
// The techno-dog. Has member functions
// with various numbers of arguments.
#include <iostream>

class Gromit {
  int arf;
public:
  Gromit(int arf = 1) : arf(arf + 1) {}
  void speak(int) {
    for(int i = 0; i < arf; i++)
      std::cout << "arf! ";
    std::cout << std::endl;
  }
  char eat(float) {
    std::cout << "chomp!" << std::endl;
    return 'z';
  }
  int sleep(char, double) {
    std::cout << "zzz..." << std::endl;
    return 0;
  }
  void sit(void) {}
}; ///:~

Now the apply( ) template functions can be combined with a vector<Gromit*> to make a
container that will call member functions of the contained objects, like this:

//: C03:applyGromit.cpp
// Test applySequence.h
#include "Gromit.h"
#include "applySequence.h"
#include <vector>
#include <iostream>
using namespace std;

int main() {
  vector<Gromit*> dogs;
  for(int i = 0; i < 5; i++)
    dogs.push_back(new Gromit(i));
  apply(dogs, &Gromit::speak, 1);
  apply(dogs, &Gromit::eat, 2.0f);
  apply(dogs, &Gromit::sleep, 'z', 3.0);
  apply(dogs, &Gromit::sit);
} ///:~

A reference to the British animated short The Wrong Trousers by Nick Park.













Although the definition of apply( ) is somewhat complex and not something you'd ever
expect a novice to understand, its use is remarkably clean and simple, and a novice could
easily use it knowing only what it is intended to accomplish, not how. This is the type of
division you should strive for in all of your program components: The tough details are all
isolated on the designer's side of the wall, and users are concerned only with accomplishing
their goals, and don't see, know about, or depend on details of the underlying implementation.



Template-templates 

//: C03:TemplateTemplate.cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
using namespace std;

// As long as things are simple
// this approach works fine:
template<typename C>
void print1(C& c) {
  typename C::iterator it;
  for(it = c.begin(); it != c.end(); it++)
    cout << *it << " ";
  cout << endl;
}

// Template-template argument must
// be a class; cannot use typename:
template<typename T, template<typename> class C>
void print2(C<T>& c) {
  copy(c.begin(), c.end(),
    ostream_iterator<T>(cout, " "));
  cout << endl;
}

int main() {
  vector<string> v(5, "Yow!");
  print1(v);
  print2(v);
} ///:~
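One practical wrinkle: the standard sequence containers take an allocator as a second template argument, and whether a template-template parameter declared with a single type parameter accepts them depends on the compiler and language mode (the matching rule was relaxed in C++17, as far as I know). A sketch that spells out both parameters sidesteps the question. The name joined( ) is hypothetical, and it returns a string instead of printing so that the result can be checked.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// C is a template taking two type arguments: the element type and
// the allocator. Both are deduced from the container passed in.
template<typename T, typename A,
         template<typename, typename> class C>
std::string joined(const C<T, A>& c) {
  std::ostringstream os;
  typename C<T, A>::const_iterator it;
  for(it = c.begin(); it != c.end(); ++it)
    os << *it << ' ';
  return os.str();
}

inline void joinedDemo() {
  std::vector<std::string> v(3, "Yow!");
  assert(joined(v) == "Yow! Yow! Yow! ");
  std::list<int> l;
  l.push_back(1);
  l.push_back(2);
  assert(joined(l) == "1 2 ");
}
```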



Member function templates 

It's also possible to make apply( ) a member function template of the class. That is, a separate
template definition from the class's template, and yet a member of the class. This may produce
a cleaner syntax:

dogs.apply(&Gromit::sit);

This is analogous to the act (in Chapter XX) of bringing ordinary functions inside a class.

The definitions of the apply( ) functions turn out to be cleaner, as well, because they are
members of the container. To accomplish this, a new container is inherited from one of the
existing STL sequence containers and the member function templates are added to the new
type. However, for maximum flexibility we'd like to be able to use any of the STL sequence
containers, and for this to work a template-template must be used, to tell the compiler that a
template argument is actually a template, itself, and can thus take a type argument and be
instantiated. Here is what it looks like after bringing the apply( ) functions into the new type
as member functions:

//: C03:applyMember.h
// applySequence.h modified to use
// member function templates

template<class T, template<typename> class Seq>
class SequenceWithApply : public Seq<T*> {
public:
  // The base class depends on T, so names inherited
  // from it must be qualified:
  typedef typename Seq<T*>::iterator iterator;
  // 0 arguments, any type of return value:
  template<class R>
  void apply(R (T::*f)()) {
    iterator it = this->begin();
    while(it != this->end()) {
      ((*it)->*f)();
      it++;
    }
  }
  // 1 argument, any type of return value:
  template<class R, class A>
  void apply(R(T::*f)(A), A a) {
    iterator it = this->begin();
    while(it != this->end()) {
      ((*it)->*f)(a);
      it++;
    }
  }
  // 2 arguments, any type of return value:
  template<class R, class A1, class A2>
  void apply(R(T::*f)(A1, A2),
    A1 a1, A2 a2) {
    iterator it = this->begin();
    while(it != this->end()) {
      ((*it)->*f)(a1, a2);
      it++;
    }
  }
}; ///:~

Check your compiler version information to see if it supports member function templates.

Because they are members, the apply( ) functions don't need as many arguments: the
sequence no longer has to be passed in. The iterator type, begin( ) and end( ) are now
members of the new type. Because they are inherited from a base class that depends on a
template parameter, a standard-conforming compiler requires the typedef and the this->
qualification shown in the listing; older compilers accepted the unqualified names. However,
the basic code is still the same.
You can see how the function calls are also simpler for the client programmer: 

//: C03:applyGromit2.cpp
// Test applyMember.h
#include "Gromit.h"
#include "applyMember.h"
#include <vector>
#include <iostream>
using namespace std;

int main() {
  SequenceWithApply<Gromit, vector> dogs;
  for(int i = 0; i < 5; i++)
    dogs.push_back(new Gromit(i));
  dogs.apply(&Gromit::speak, 1);
  dogs.apply(&Gromit::eat, 2.0f);
  dogs.apply(&Gromit::sleep, 'z', 3.0);
  dogs.apply(&Gromit::sit);
} ///:~
Conceptually, it reads more sensibly to say that you're calling apply( ) for the dogs container. 

Why virtual member template functions 
are disallowed 

Nested template classes 

Template specializations 





Full specialization 


Partial Specialization 


A practical example 


There's nothing to prevent you from using a class template in any way you'd use an ordinary
class. For example, you can easily inherit from a template, and you can create a new template
that instantiates and inherits from an existing template. If the vector class does everything you
want, but you'd also like it to sort itself, you can easily reuse the code and add value to it:




//: C03:Sorted.h
// Template specialization
#ifndef SORTED_H
#define SORTED_H
#include <cstring>
#include <vector>

template<class T>
class Sorted : public std::vector<T> {
public:
  void sort();
};

// this-> is needed because the base class depends on T
template<class T>
void Sorted<T>::sort() { // A bubble sort
  for(int i = this->size(); i > 0; i--)
    for(int j = 1; j < i; j++)
      if(this->at(j-1) > this->at(j)) {
        // Swap the two elements:
        T t = this->at(j-1);
        this->at(j-1) = this->at(j);
        this->at(j) = t;
      }
}

// Partial specialization for pointers:
template<class T>
class Sorted<T*> : public std::vector<T*> {
public:
  void sort();
};

template<class T>
void Sorted<T*>::sort() {
  for(int i = this->size(); i > 0; i--)
    for(int j = 1; j < i; j++)
      if(*this->at(j-1) > *this->at(j)) {
        // Swap the two elements:
        T* t = this->at(j-1);
        this->at(j-1) = this->at(j);
        this->at(j) = t;
      }
}

// Full specialization for char*:
template<>
inline void Sorted<char*>::sort() {
  for(int i = size(); i > 0; i--)
    for(int j = 1; j < i; j++)
      if(std::strcmp(at(j-1), at(j)) > 0) {
        // Swap the two elements:
        char* t = at(j-1);
        at(j-1) = at(j);
        at(j) = t;
      }
}
#endif // SORTED_H ///:~

The Sorted template imposes a restriction on all classes it is instantiated for: They must 
contain a > operator. In SString this is added explicitly, but in Integer the automatic type 
conversion operator int provides a path to the built-in > operator. When a template provides 
more functionality for you, the trade-off is usually that it puts more requirements on your 
class. Sometimes you'll have to inherit the contained class to add the required functionality. 
Notice the value of using an overloaded operator here - the Integer class can rely on its 
underlying implementation to provide the functionality. 

The default Sorted template only works with objects (including objects of built-in types).
However, it won't sort pointers to objects so the partial specialization is necessary. Even then,
the code generated by the partial specialization won't sort an array of char*. To solve this, the
full specialization compares the char* elements using strcmp( ) to produce the proper
behavior.
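The way the compiler chooses among the primary template, a partial specialization and a full specialization can be seen in isolation with a small comparison helper. The Greater template below is hypothetical and separate from Sorted; it uses const char* so that string literals can be passed directly.

```cpp
#include <cassert>
#include <cstring>

// Primary template: compare the values themselves.
template<class T>
struct Greater {
  static bool test(const T& a, const T& b) { return a > b; }
};

// Partial specialization for any pointer type:
// compare what the pointers refer to, not the addresses.
template<class T>
struct Greater<T*> {
  static bool test(T* a, T* b) { return *a > *b; }
};

// Full specialization for C strings: compare the whole string.
// Without it, the pointer version would compare only the first
// character of each string.
template<>
struct Greater<const char*> {
  static bool test(const char* a, const char* b) {
    return std::strcmp(a, b) > 0;
  }
};

inline void greaterDemo() {
  int big = 5, small = 3;
  assert(Greater<int>::test(big, small));      // Primary
  assert(Greater<int*>::test(&big, &small));   // Partial
  assert(Greater<const char*>::test("this", "that"));  // Full
  assert(!Greater<const char*>::test("that", "this"));
}
```

The strings "this" and "that" share their first two characters, so only the strcmp( ) version can tell them apart.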

Here's a test for Sorted.h that uses the unique random number generator introduced earlier in
the chapter:



//: C03:Sorted.cpp
// Testing template
#include "Sorted.h"
#include "Urand.h"
#include "../arraySize.h"
#include <iostream>
#include <string>
using namespace std;


rds[] = { 

"running", "big", "dog" 



rds2[] = { 

, "that", "theothe 



int main() {
  Sorted<int> is;
  Urand<47> rand;
  for(int i = 0; i < 15; i++)
    is.push_back(rand());
  for(int l = 0; l < is.size(); l++)
    cout << is[l] << ' ';
  cout << endl;
  is.sort();
  for(int l = 0; l < is.size(); l++)
    cout << is[l] << ' ';
  cout << endl;

  // Uses the template partial specialization:
  Sorted<string*> ss;
  for(int i = 0; i < asz(words); i++)
    ss.push_back(new string(words[i]));
  for(int i = 0; i < ss.size(); i++)
    cout << *ss[i] << ' ';
  cout << endl;
  ss.sort();
  for(int i = 0; i < ss.size(); i++)
    cout << *ss[i] << ' ';
  cout << endl;

  // Uses the full char* specialization:
  Sorted<char*> scp;
  for(int i = 0; i < asz(words2); i++)
    scp.push_back(words2[i]);
  for(int i = 0; i < scp.size(); i++)
    cout << scp[i] << ' ';
  cout << endl;
  scp.sort();
  for(int i = 0; i < scp.size(); i++)
    cout << scp[i] << ' ';
  cout << endl;
} ///:~

Each of the template instantiations uses a different version of the template. Sorted<int> uses
the "ordinary," non-specialized template. Sorted<string*> uses the partial specialization for
pointers. Lastly, Sorted<char*> uses the full specialization for char*. Note that without this
full specialization, you could be fooled into thinking that things were working correctly
because the words array would still sort out to "a big dog is running" since the partial
specialization would end up comparing the first character of each array. However, words2
would not sort out correctly, and for the desired behavior the full specialization is necessary.
Pointer specialization

Partial ordering of function templates 

Design & efficiency 

In Sorted, every time you call add( ) the element is inserted and the array is resorted. Here, 
the horribly inefficient and greatly deprecated (but easy to understand and code) bubble sort is 
used. This is perfectly appropriate, because it's part of the private implementation. During 
program development, your priorities are to 

1. Get the class interfaces correct. 

2. Create an accurate implementation as rapidly as possible so you can: 

3. Prove your design. 

Very often, you will discover problems with the class interface only when you assemble your 
initial "rough draft" of the working system. You may also discover the need for "helper" 
classes like containers and iterators during system assembly and during your first-pass 
implementation. Sometimes it's very difficult to discover these kinds of issues during analysis 
- your goal in analysis should be to get a big-picture design that can be rapidly implemented 
and tested. Only after the design has been proven should you spend the time to flesh it out 
completely and worry about performance issues. If the design fails, or if performance is not a 
problem, the bubble sort is good enough, and you haven't wasted any time. (Of course, the 
ideal solution is to use someone else's sorted container; the Standard C++ template library is 
the first place to look.) 
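As a sketch of that "first place to look," here are two ways the standard library can stand in for the hand-written bubble sort. The function names sortedCopy( ) and viaMultiset( ) are hypothetical; std::sort and std::multiset are the standard facilities.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Sort on demand: std::sort works on any random-access sequence.
inline std::vector<int> sortedCopy(std::vector<int> v) {
  std::sort(v.begin(), v.end());
  return v;
}

// Keep elements ordered at all times: a multiset stays sorted as
// elements are inserted and allows duplicates.
inline std::vector<int> viaMultiset(const std::vector<int>& v) {
  std::multiset<int> s(v.begin(), v.end());
  return std::vector<int>(s.begin(), s.end());
}

inline void sortDemo() {
  int raw[] = { 3, 1, 2, 1 };
  std::vector<int> v(raw, raw + 4);
  std::vector<int> a = sortedCopy(v);
  std::vector<int> b = viaMultiset(v);
  assert(a == b);
  assert(a[0] == 1 && a[1] == 1 && a[2] == 2 && a[3] == 3);
}
```

Which one fits depends on the interface you settled on: sort-on-demand matches Sorted's sort( ) member, while the multiset matches a container that must always be in order.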



Preventing template bloat 



Each time you instantiate a template, the code in the template is generated anew (except for
inline functions). If some of the functionality of a template does not depend on type, it can be
put in a common base class to prevent needless reproduction of that code. For example, in
Chapter XX in InheritStack.cpp inheritance was used to specify the types that a Stack could
accept and produce. Here's the templatized version of that code:

//: C03:Nobloat.h
// Templatized InheritStack.cpp
#ifndef NOBLOAT_H
#define NOBLOAT_H
#include "../C0A/Stack4.h"

template<class T>
class NBStack : public Stack {
public:
  void push(T* str) {
    Stack::push(str);
  }
  T* peek() const {
    return (T*)Stack::peek();
  }
  T* pop() {
    return (T*)Stack::pop();
  }
  ~NBStack();
};

// Defaults to heap objects
template<class T>
NBStack<T>::~NBStack() {
  T* top = pop();
  while(top) {
    delete top;
    top = pop();
  }
}
#endif // NOBLOAT_H ///:~

As before, the inline functions generate no code and are thus "free." The functionality is
provided by creating the base-class code only once. However, the ownership problem has
been solved here by adding a destructor (which is type-dependent, and thus must be created
by the template). Here, it defaults to ownership. Notice that when the base-class destructor is
called, the stack will be empty so no duplicate releases will occur.
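Because Stack4.h is not reproduced here, the same idea is sketched below in self-contained form. PtrStack is a hypothetical stand-in for the Stack base class and TypedStack for NBStack; only the thin wrapper is a template, so the container logic is compiled once no matter how many element types are used.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the non-template base class: all the real work is
// done here, once, on void*.
class PtrStack {
  std::vector<void*> items;
public:
  void push(void* p) { items.push_back(p); }
  void* pop() {  // Returns 0 when the stack is empty
    if(items.empty()) return 0;
    void* p = items.back();
    items.pop_back();
    return p;
  }
};

// Thin typed wrapper: inline casts plus a type-dependent destructor.
// Copying is not handled in this sketch.
template<class T>
class TypedStack : private PtrStack {
public:
  void push(T* p) { PtrStack::push(p); }
  T* pop() { return static_cast<T*>(PtrStack::pop()); }
  ~TypedStack() {  // Owns whatever is still on the stack
    while(T* top = pop())
      delete top;
  }
};

inline void typedStackDemo() {
  TypedStack<std::string> s;
  s.push(new std::string("first"));
  s.push(new std::string("second"));
  std::string* top = s.pop();
  assert(*top == "second");
  delete top;
  // "first" is deleted by the destructor
}
```

Private inheritance is a design choice of the sketch: it keeps clients from pushing an untyped void* through the base-class interface.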

//: C03:NobloatTest.cpp
#include "Nobloat.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1); // File name is argument
  ifstream in(argv[1]);
  assure(in, argv[1]);
  NBStack<string> textlines;
  string line;
  // Read file and store lines in the stack:
  while(getline(in, line))
    textlines.push(new string(line));
  // Pop the lines from the stack and print them:
  string* s;
  while((s = (string*)textlines.pop()) != 0) {
    cout << *s << endl;
    delete s;
  }
} ///:~

Explicit instantiation 



At times it is useful to explicitly instantiate a template; that is, to tell the compiler to lay down 
the code for a specific version of that template even though you're not creating an object at 
that point. To do this, you reuse the template keyword as follows: 

template class Bobbin<thread>;
template void sort<char>(char*[]);

Here's a version of the Sorted.cpp example that explicitly instantiates a template:



//: C03:ExplicitInstantiation.cpp
#include "Urand.h"
#include "Sorted.h"
#include <iostream>
using namespace std;

// Explicit instantiation:
template class Sorted<int>;

int main() {
  Sorted<int> is;
  Urand<47> rand1;
  for(int k = 0; k < 15; k++)
    is.push_back(rand1());
} ///:~



In this example, the explicit instantiation doesn't really accomplish anything; the program
would work the same without it. Explicit instantiation is only for special cases where extra
control is needed.
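A self-contained sketch of the syntax follows, using a hypothetical Box class template and twice( ) function template. One visible effect of explicitly instantiating a class is that every member function is generated, whether or not anything calls it.

```cpp
#include <cassert>

// Hypothetical class template.
template<class T>
class Box {
  T value;
public:
  explicit Box(const T& v) : value(v) {}
  T get() const { return value; }
  T doubled() const { return value + value; }
};

// Hypothetical function template.
template<class T>
T twice(T x) { return x + x; }

// Explicit instantiations: all members of Box<int> and the function
// twice<long> are generated at this point, used or not.
template class Box<int>;
template long twice<long>(long);

inline void explicitDemo() {
  Box<int> b(21);
  assert(b.get() == 21);
  assert(b.doubled() == 42);
  assert(twice<long>(4L) == 8L);
}
```

A typical use is to collect the instantiations a program needs in a single source file so the code is generated in a known place.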






Explicit specification of template 
functions 

Controlling template 
instantiation 

Normally templates are not instantiated until they are needed. For function templates this just 
means the point at which you call the function, but for class templates it's more granular than 
that: each individual member function of the template is not instantiated until the first point of 
use. This means that only the member functions you actually use will be instantiated, which is 
quite important since it allows greater freedom in what the template can be used with. For 
example: 

//: C03:DelayedInstantiation.cpp
// Member functions of class templates are not
// instantiated until they're needed.

class X {
public:
  void f() {}
};

class Y {
public:
  void g() {}
};

template <typename T> class Z {
  T t;
public:
  void a() { t.f(); }
  void b() { t.g(); }
};

int main() {
  Z<X> zx;
  zx.a(); // Doesn't create Z<X>::b()
  Z<Y> zy;
  zy.b(); // Doesn't create Z<Y>::a()
} ///:~

Here, even though the template purports to use both f( ) and g( ) member functions of T, the
fact that the program compiles shows you that it only generates Z<X>::a( ) when it is
explicitly called for zx (if Z<X>::b( ) were also generated at the same time, a compile-time
error message would be generated). Similarly, the call to zy.b( ) doesn't generate Z<Y>::a( ).
As a result, the Z template can be used with X and Y, whereas if all the member functions
were generated when the class was first created it would significantly limit the use of many
templates.
The inclusion vs. separation models 
The export keyword 

Template programming idioms 

The "curiously-recurring template" 
Traits 

Summary 

One of the greatest weaknesses of C++ templates will be shown to you when you try to write
code that uses templates, especially STL code (introduced in the next two chapters), and start
getting compile-time error messages. When you're not used to it, the quantity of inscrutable
text that will be spewed at you by the compiler will be quite overwhelming. After a while
you'll adapt (although it always feels a bit barbaric), and if it's any consolation, C++
compilers have actually gotten a lot better about this - previously they would only give the
line where you tried to instantiate the template, and most of them now go to the line in the
template definition that caused the problem.

The issue is that a template implies an interface. That is, even though the template keyword
says "I'll take any type," the code in a template definition actually requires that certain
operators and member functions be supported - that's the interface. So in reality, a template
definition is saying "I'll take any type that supports this interface." Things would be much
nicer if the compiler could simply say "hey, this type that you're trying to instantiate the
template with doesn't support that interface - can't do it." The Java language has a feature
called interface that would be a perfect match for this (Java, however, has no parameterized
type mechanism), but it will be many years, if ever, before you will see such a thing in C++
(at this writing the C++ Standard has only just been accepted and it will be a while before all
the compilers even achieve compliance). Compilers can only get so good at reporting
template instantiation errors, so you'll have to grit your teeth, go to the first line reported as an
error and figure it out.
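Later revisions of the language moved in this direction: static_assert arrived in C++11 and concepts in C++20. As a sketch of the idea using only C++11 features, a template can state its implied interface up front and fail with a readable message. The trait HasGreater and the function larger( ) are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Detects whether "a > b" compiles for two const T objects and
// yields something convertible to bool.
template<class T>
class HasGreater {
  template<class U>
  static auto check(int) -> decltype(
    static_cast<bool>(std::declval<const U&>() >
                      std::declval<const U&>()),
    std::true_type());
  template<class U>
  static std::false_type check(...);  // Fallback when > is missing
public:
  static const bool value = decltype(check<T>(0))::value;
};

// States its interface requirement before using it.
template<class T>
const T& larger(const T& a, const T& b) {
  static_assert(HasGreater<T>::value,
                "larger<T> requires T to support operator>");
  return a > b ? a : b;
}

struct NoCompare {};  // Deliberately lacks operator>

static_assert(HasGreater<int>::value, "int supports >");
static_assert(!HasGreater<NoCompare>::value,
              "NoCompare does not support >");
```

Calling larger( ) with two NoCompare objects reports the static_assert message, which names the missing part of the interface.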
4: STL Containers
& Iterators



Container classes are the solution to a specific kind of code 
reuse problem. They are building blocks used to create 
object-oriented programs - they make the internals of a 
program much easier to construct. 

A container class describes an object that holds other objects. Container classes are so
important that they were considered fundamental to early object-oriented languages. In
Smalltalk, for example, programmers think of the language as the program translator together
with the class library, and a critical part of that library is the container classes. So it became
natural that C++ compiler vendors also include a container class library. You'll note that the
vector was so useful that it was introduced in its simplest form very early in this book.

Like many other early C++ libraries, early container class libraries followed Smalltalk's
object-based hierarchy, which worked well for Smalltalk, but turned out to be awkward and
difficult to use in C++. Another approach was required.

This chapter attempts to slowly work you into the concepts of the C++ Standard Template
Library (STL), which is a powerful library of containers (as well as algorithms, but these are
covered in the following chapter). In the past, I have taught that there is a relatively small
subset of elements and ideas that you need to understand in order to get much of the
usefulness from the STL. Although this can be true, it turns out that understanding the STL
more deeply is important to gain the full power of the library. This chapter and the next probe
into the STL containers and algorithms.

Containers and iterators 

If you don't know how many objects you're going to need to solve a particular problem, or
how long they will last, you also don't know how to store those objects. How can you know
how much space to create? You can't, since that information isn't known until run time.

The solution to most problems in object-oriented design seems flippant: you create another
type of object. For the storage problem, the new type of object holds other objects, or pointers
to objects. Of course, you can do the same thing with an array, but there's more. This new
type of object, which is typically referred to in C++ as a container (also called a collection in
some languages), will expand itself whenever necessary to accommodate everything you
place inside it. So you don't need to know how many objects you're going to hold in a
collection. You just create a collection object and let it take care of the details.

Fortunately, a good OOP language comes with a set of containers as part of the package. In
C++, it's the Standard Template Library (STL). In some libraries, a generic container is
considered good enough for all needs, and in others (C++ in particular) the library has
different types of containers for different needs: a vector for consistent access to all elements,
and a linked list for consistent insertion at all elements, for example, so you can choose the
particular type that fits your needs. These may include sets, queues, hash tables, trees, stacks.

All containers have some way to put things in and get things out. The way that you place
something into a container is fairly obvious. There's a function called "push" or "add" or a
similar name. Fetching things out of a container is not always as apparent; if it's an array-like
entity such as a vector, you might be able to use an indexing operator or function. But in
many situations this doesn't make sense. Also, a single-selection function is restrictive. What
if you want to manipulate or compare a group of elements in the container?

The solution is an iterator, which is an object whose job is to select the elements within a 
container and present them to the user of the iterator. As a class, it also provides a level of 
abstraction. This abstraction can be used to separate the details of the container from the code 
that's accessing that container. The container, via the iterator, is abstracted to be simply a 
sequence. The iterator allows you to traverse that sequence without worrying about the 
underlying structure - that is, whether it's a vector, a linked list, a stack or something else. 
This gives you the flexibility to easily change the underlying data structure without disturbing 
the code in your program. 
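A small sketch makes the point concrete: a function written only in terms of iterators works unchanged on a vector and on a linked list. The function sumAll( ) and the demo around it are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Knows nothing about the container except that it can hand out
// iterators marking the beginning and end of a sequence.
template<class Seq>
int sumAll(const Seq& s) {
  int total = 0;
  typename Seq::const_iterator it;
  for(it = s.begin(); it != s.end(); ++it)
    total += *it;  // The iterator selects each element in turn
  return total;
}

inline void sumAllDemo() {
  std::vector<int> v;
  std::list<int> l;
  for(int i = 1; i <= 4; i++) {
    v.push_back(i);
    l.push_back(i);
  }
  assert(sumAll(v) == 10);  // Contiguous storage
  assert(sumAll(l) == 10);  // Linked nodes, same traversal code
}
```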

From the design standpoint, all you really want is a sequence that can be manipulated to solve 
your problem. If a single type of sequence satisfied all of your needs, there'd be no reason to 
have different kinds. There are two reasons that you need a choice of containers. First, 
containers provide different types of interfaces and external behavior. A stack has a different 
interface and behavior than that of a queue, which is different than that of a set or a list. One 
of these might provide a more flexible solution to your problem than the other. Second, 
different containers have different efficiencies for certain operations. The best example is a 
vector and a list. Both are simple sequences that can have identical interfaces and external 
behaviors. But certain operations can have radically different costs. Randomly accessing 
elements in a vector is a constant-time operation; it takes the same amount of time regardless 
of the element you select. However, in a linked list it is expensive to move through the list to 
randomly select an element, and it takes longer to find an element if it is further down the list. 
On the other hand, if you want to insert an element in the middle of a sequence, it's much 
cheaper in a list than in a vector. These and other operations have different efficiencies 
depending upon the underlying structure of the sequence. In the design phase, you might start 
with a list and, when tuning for performance, change to a vector. Because of the abstraction 
via iterators, you can change from one to the other with minimal impact on your code. 

In the end, remember that a container is only a storage cabinet to put objects in. If that cabinet 
solves all of your needs, it doesn't really matter how it is implemented (a basic concept with 
most types of objects). If you're working in a programming environment that has built-in 
overhead due to other factors, then the cost difference between a vector and a linked list might 
not matter. You might need only one type of sequence. You can even imagine the "perfect" 
container abstraction, which can automatically change its underlying implementation 
according to the way it is used. 

STL reference documentation 

You will notice that this chapter does not contain exhaustive documentation describing each 
of the member functions in each STL container. Although I describe the member functions 
that I use, I've left the full descriptions to others: there are at least two very good on-line 
sources of STL documentation in HTML format that you can keep resident on your computer 
and view with a Web browser whenever you need to look something up. The first is the 
Dinkumware library (which covers the entire Standard C and C++ library) mentioned at the 
beginning of this book section (page XXX). The second is the freely-downloadable SGI STL 
and documentation, available at http://www.sgi.com/Technology/STL/. These 
should provide complete references when you're writing code. In addition, the STL books 
listed in Appendix XX will provide you with other resources. 

The Standard Template Library 

The STL is a powerful library intended to satisfy the vast bulk of your needs for 
containers and algorithms, but in a completely portable fashion. This means that not only are 
your programs easier to port to other platforms, but that your knowledge itself does not 
depend on the libraries provided by a particular compiler vendor (and the STL is likely to be 
more tested and scrutinized than a particular vendor's library). Thus, it will benefit you 
greatly to look first to the STL for containers and algorithms, before looking at vendor- 
specific solutions. 

A fundamental principle of software design is that all problems can be simplified by 
introducing an extra level of indirection. This simplicity is achieved in the STL using 
iterators to perform operations on a data structure while knowing as little as possible about 
that structure, thus producing data structure independence. With the STL, this means that any 
operation that can be performed on an array of objects can also be performed on an STL 
container of objects and vice versa. The STL containers work just as easily with built-in types 
as they do with user-defined types. If you learn the library, it will work on everything. 
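
As a small sketch of that array/container symmetry, the same algorithm call can be handed either raw pointers or container iterators. The helper names countValue, countInArray and countInVector are hypothetical, invented only for this illustration; std::count is the standard algorithm doing the work.

```cpp
#include <algorithm>
#include <vector>

// Counts how many elements in [first, last) equal value.
// Any input iterator will do, including a raw pointer into an array.
template<class Iter>
int countValue(Iter first, Iter last, int value) {
  return static_cast<int>(std::count(first, last, value));
}

// The algorithm applied to a built-in array: pointers act as iterators.
int countInArray() {
  int a[] = { 1, 2, 2, 3, 2 };
  return countValue(a, a + 5, 2);
}

// The same algorithm applied to an STL container holding the same values.
int countInVector() {
  int a[] = { 1, 2, 2, 3, 2 };
  std::vector<int> v(a, a + 5);
  return countValue(v.begin(), v.end(), 2);
}
```

Both helpers return the same count, because countValue( ) never finds out what kind of sequence it was given.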



Contributed to the C++ Standard by Alexander Stepanov and Meng Lee at Hewlett-Packard. 

The drawback to this independence is that you'll have to take a little time at first getting used 
to the way things are done in the STL. However, the STL uses a consistent pattern, so once 
you fit your mind around it, it doesn't change from one STL tool to another. 

Consider an example using the STL set class. A set will allow only one of each object value 
to be inserted into itself. Here is a simple set created to work with ints by providing int as the 
template argument to set: 



//: C04:Intset.cpp
// Simple use of STL set
#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;

int main() {
  set<int> intset;
  for(int i = 0; i < 25; i++)
    for(int j = 0; j < 10; j++)
      // Try to insert multiple copies:
      intset.insert(j);
  // Print to output:
  copy(intset.begin(), intset.end(),
    ostream_iterator<int>(cout, "\n"));
} ///:~

The insert( ) member does all the work: it tries putting the new element in and rejects it if it's 
already there. Very often the activities involved in using a set are simply insertion and a test 
to see whether it contains the element. You can also form a union, intersection, or difference 
of sets, and test to see if one set is a subset of another. 
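
Here is one way those set operations might look, as a sketch rather than a definitive recipe. The typedef IntSet and the helpers makeRange, contains, unionOf, intersectionOf, differenceOf and isSubset are hypothetical names made up for this example; the standard pieces are set::find( ), the algorithms set_union( ), set_intersection( ), set_difference( ) and includes( ), and the inserter( ) function described later in this chapter.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

typedef std::set<int> IntSet;

// Builds the set of integers in the half-open range [first, last).
IntSet makeRange(int first, int last) {
  IntSet s;
  for(int i = first; i < last; i++)
    s.insert(i);
  return s;
}

// Membership test: find() returns end() when the value is absent.
bool contains(const IntSet& s, int value) {
  return s.find(value) != s.end();
}

IntSet unionOf(const IntSet& a, const IntSet& b) {
  IntSet result;
  std::set_union(a.begin(), a.end(), b.begin(), b.end(),
    std::inserter(result, result.begin()));
  return result;
}

IntSet intersectionOf(const IntSet& a, const IntSet& b) {
  IntSet result;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
    std::inserter(result, result.begin()));
  return result;
}

// Elements of a that are not in b.
IntSet differenceOf(const IntSet& a, const IntSet& b) {
  IntSet result;
  std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
    std::inserter(result, result.begin()));
  return result;
}

// True if every element of sub also appears in super.
bool isSubset(const IntSet& sub, const IntSet& super) {
  return std::includes(super.begin(), super.end(),
    sub.begin(), sub.end());
}
```

These algorithms rely on their input ranges being sorted, which a set guarantees automatically.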

In this example, the values 0-9 are inserted into the set 25 times, and the results are printed 
out to show that only one of each of the values is actually retained in the set. 

The copy( ) function is actually the instantiation of an STL template function, of which there 
are many. These template functions are generally referred to as "the STL Algorithms" and 
will be the subject of the following chapter. However, several of the algorithms are so useful 
that they will be introduced in this chapter. Here, copy( ) shows the use of iterators. The set 
member functions begin( ) and end( ) produce iterators as their return values. These are used 
by copy( ) as beginning and ending points for its operation, which is simply to move between 
the boundaries established by the iterators and copy the elements to the third argument, which 
is also an iterator, but in this case, a special type created for iostreams. This places int objects 
on cout and separates them with a newline. 

Because of its genericity, copy( ) is certainly not restricted to printing on a stream. It can be 
used in virtually any situation: it needs only three iterators to talk to. All of the algorithms 
follow the form of copy( ) and simply manipulate iterators (the use of iterators is the "extra 
level of indirection"). 

Now consider taking the form of Intset.cpp and reshaping it to display a list of the words 
used in a document. The solution becomes remarkably simple. 

//: C04:WordSet.cpp
#include "../require.h"
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream source(argv[1]);
  assure(source, argv[1]);
  string word;
  set<string> words;
  while(source >> word)
    words.insert(word);
  copy(words.begin(), words.end(),
    ostream_iterator<string>(cout, "\n"));
  cout << "Number of unique words:"
    << words.size() << endl;
} ///:~

The only substantive difference here is that string is used instead of int. The words are pulled 
from a file, but everything else is the same as in Intset.cpp. The operator>> returns a 
whitespace-separated group of characters each time it is called, until there's no more input 
from the file. So it approximately breaks an input stream up into words. Each string is placed 
in the set using insert( ), and the copy( ) function is used to display the results. Because of the 
way set is implemented (as a tree), the words are automatically sorted. 

Consider how much effort it would be to accomplish the same task in C, or even in C++ 
without the STL. 



The basic concepts 



The primary idea in the STL is the container (also known as a collection), which is just what 
it sounds like: a place to hold things. You need containers because objects are constantly 
marching in and out of your program and there must be someplace to put them while they're 
around. You can't make named local objects because in a typical program you don't know 
how many, or what type, or the lifetime of the objects you're working with. So you need a 
container that will expand whenever necessary to fill your needs. 

All the containers in the STL hold objects and expand themselves. In addition, they hold your 
objects in a particular way. The difference between one container and another is the way the 
objects are held and how the sequence is created. Let's start by looking at the simplest 
containers. 



A vector is a linear sequence that allows rapid random access to its elements. However, it's 
expensive to insert an element in the middle of the sequence, and is also expensive when it 
allocates additional storage. A deque is also a linear sequence, and it allows random access 
that's nearly as fast as vector, but it's significantly faster when it needs to allocate new 
storage, and you can easily add new elements at either end (vector only allows the addition of 
elements at its tail). A list is the third type of basic linear sequence, but it's expensive to move 
around randomly and cheap to insert an element in the middle. Thus list, deque and vector 
are very similar in their basic functionality (they all hold linear sequences), but different in the 
cost of their activities. So for your first shot at a program, you could choose any one, and only 
experiment with the others if you're tuning for efficiency. 

Many of the problems you set out to solve will only require a simple linear sequence like a 
vector, deque or list. All three have a member function push_back( ) which you use to insert 
a new element at the back of the sequence (deque and list also have push_front( )). 
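
A brief sketch of what that shared interface buys you: one function template can fill any of the three sequences. The names fillBack and fillFront are hypothetical, chosen only for this illustration.

```cpp
#include <deque>
#include <list>
#include <vector>

// Appends 0..n-1 to any sequence that supports push_back():
// vector, deque and list all qualify.
template<class Seq>
void fillBack(Seq& s, int n) {
  for(int i = 0; i < n; i++)
    s.push_back(i);
}

// Prepends 0..n-1. Only deque and list provide push_front(), so
// instantiating this for a vector is a compile-time error.
template<class Seq>
void fillFront(Seq& s, int n) {
  for(int i = 0; i < n; i++)
    s.push_front(i);
}
```

Calling fillBack( ) on a vector, a deque or a list produces the same sequence in each; fillFront( ) produces the values in reverse order, since each new element lands ahead of the previous one.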

But now how do you retrieve those elements? With a vector or deque, it is possible to use the 
indexing operator [ ], but that doesn't work with list. Since it would be nicest to learn a single 
interface, we'll often use the one defined for all STL containers: the iterator. 



An iterator is a class that abstracts the process of moving through a sequence. It allows you to 
select each element of a sequence without knowing the underlying structure of that sequence. 
This is a powerful feature, partly because it allows us to learn a single interface that works 
with all containers, and partly because it allows containers to be used interchangeably. 
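
To make the "single interface" point concrete, here is a minimal sketch of a function that walks a container purely through iterators. The name joinAll is hypothetical; nothing in its body depends on which container it is given.

```cpp
#include <list>
#include <string>
#include <vector>

// Concatenates every string in the container, visiting the elements
// only through begin(), end(), ++, * and !=.
template<class Cont>
std::string joinAll(const Cont& c) {
  std::string result;
  for(typename Cont::const_iterator it = c.begin();
      it != c.end(); ++it)
    result += *it;
  return result;
}
```

Passing a vector<string> or a list<string> with the same contents yields the same result, even though one stores its elements contiguously and the other as linked nodes.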



One more observation and you're ready for another example. Even though the STL containers 
hold objects by value (that is, they hold the whole object inside themselves) that's probably 
not the way you'll generally use them if you're doing object-oriented programming. That's 
because in OOP, most of the time you'll create objects on the heap with new and then upcast 
the address to the base-class type, later manipulating it as a pointer to the base class. The 
beauty of this is that you don't worry about the specific type of object you're dealing with, 
which greatly reduces the complexity of your code and increases the maintainability of your 
program. This process of upcasting is what you try to do in OOP with polymorphism, so 
you'll usually be using containers of pointers. 

Consider the classic "shape" example where shapes have a set of common operations, and you 
have different types of shapes. Here's what it looks like using the STL vector to hold pointers 
to various types of Shape created on the heap: 

//: C04:Stlshape.cpp
// Simple shapes w/ STL
#include <vector>
#include <iostream>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual ~Shape() {};
};

class Circle : public Shape {
public:
  void draw() { cout << "Circle::draw\n"; }
  ~Circle() { cout << "~Circle\n"; }
};

class Triangle : public Shape {
public:
  void draw() { cout << "Triangle::draw\n"; }
  ~Triangle() { cout << "~Triangle\n"; }
};

class Square : public Shape {
public:
  void draw() { cout << "Square::draw\n"; }
  ~Square() { cout << "~Square\n"; }
};

typedef std::vector<Shape*> Container;
typedef Container::iterator Iter;

int main() {
  Container shapes;
  shapes.push_back(new Circle);
  shapes.push_back(new Square);
  shapes.push_back(new Triangle);
  for(Iter i = shapes.begin();
      i != shapes.end(); i++)
    (*i)->draw();
  // ... Sometime later:
  for(Iter j = shapes.begin();
      j != shapes.end(); j++)
    delete *j;
} ///:~
The creation of Shape, Circle, Square and Triangle should be fairly familiar. Shape is a 
pure abstract base class (because of the pure specifier =0) that defines the interface for all 
types of shapes. The derived classes redefine the virtual function draw( ) to perform the 
appropriate operation. Now we'd like to create a bunch of different types of Shape object, but 
where to put them? In an STL container, of course. For convenience, this typedef: 

  typedef std::vector<Shape*> Container;

creates an alias for a vector of Shape*, and this typedef: 

  typedef Container::iterator Iter;

uses that alias to create another one, for vector<Shape*>::iterator. Notice that the container 
type name must be used to produce the appropriate iterator, which is defined as a nested class. 
Although there are different types of iterators (forward, bidirectional, reverse, etc., which will 
be explained later) they all have the same basic interface: you can increment them with ++, 
you can dereference them to produce the object they're currently selecting, and you can test 
them to see if they're at the end of the sequence. That's what you'll want to do 90% of the 
time. And that's what is done in the above example: after creating a container, it's filled with 
different types of Shape*. Notice that the upcast happens as the Circle, Square or Triangle 
pointer is added to the shapes container, which doesn't know about those specific types but 
instead holds only Shape*. So as soon as the pointer is added to the container it loses its 
specific identity and becomes an anonymous Shape*. This is exactly what we want: toss them 
all in and let polymorphism sort it out. 

The first for loop creates an iterator and sets it to the beginning of the sequence by calling the 
begin( ) member function for the container. All containers have begin( ) and end( ) member 
functions that produce an iterator selecting, respectively, the beginning of the sequence and 
one past the end of the sequence. To test to see if you're done, you make sure you're != to the 
iterator produced by end( ). Not < or <=. The only test that works is !=. So it's very common 
to write a loop like: 

  for(Iter i = shapes.begin(); i != shapes.end(); i++)

This says: "take me through every element in the sequence." 

What do you do with the iterator to produce the element it's selecting? You dereference it 
using (what else) the '*' (which is actually an overloaded operator). What you get back is 
whatever the container is holding. This container holds Shape*, so that's what *i produces. If 
you want to send a message to the Shape, you must select that message with ->, so you write 
the line: 

  (*i)->draw();

This calls the draw( ) function for the Shape* the iterator is currently selecting. The 
parentheses are ugly but necessary to produce the proper order of evaluation. As an 
alternative, operator-> is defined so that you can say: 

As they are destroyed or in other cases where the pointers are removed, the STL containers do 
not call delete for the pointers they contain. If you create an object on the heap with new and 
place its pointer in a container, the container can't tell if that pointer is also placed inside 
another container. So the STL just doesn't do anything about it, and puts the responsibility 
squarely in your lap. The last lines in the program move through and delete every object in the 
container so proper cleanup occurs. 



It's very interesting to note that you can change the type of container that this program uses 
with two lines. Instead of including <vector>, you include <list>, and in the first typedef you 
say: 

  typedef std::list<Shape*> Container;

instead of using a vector. Everything else goes untouched. This is possible not because of an 
interface enforced by inheritance (there isn't any inheritance in the STL, which comes as a 
surprise when you first see it), but because the interface is enforced by a convention adopted 
by the designers of the STL, precisely so you could perform this kind of interchange. Now 
you can easily switch between vector and list and see which one works fastest for your needs. 



Containers of strings 



In the previous example, at the end of main( ), it was necessary to move through the whole list 
and delete all the Shape pointers. 

  for(Iter j = shapes.begin();
      j != shapes.end(); j++)
    delete *j;

This highlights what could be seen as a flaw in the STL: there's no facility in any of the STL 
containers to automatically delete the pointers they contain, so you must do it by hand. It's as 
if the assumption of the STL designers was that containers of pointers weren't an interesting 
problem, although I assert that it is one of the more common things you'll want to do. 

Automatically deleting a pointer turns out to be a rather aggressive thing to do because of the 
multiple membership problem. If a container holds a pointer to an object, it's not unlikely that 
pointer could also be in another container. A pointer to an Aluminum object in a list of Trash 
pointers could also reside in a list of Aluminum pointers. If that happens, which list is 
responsible for cleaning up that object - that is, which list "owns" the object? 

This question is virtually eliminated if the object rather than a pointer resides in the list. Then 
it seems clear that when the list is destroyed, the objects it contains must also be destroyed. 
Here, the STL shines, as you can see when creating a container of string objects. The 
following example stores each incoming line as a string in a vector<string>: 

//: C04:StringVector.cpp
// A vector of strings
#include "../require.h"
#include <algorithm>
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  vector<string> strings;
  string line;
  while(getline(in, line))
    strings.push_back(line);
  // Do something to the strings...
  int i = 1;
  vector<string>::iterator w;
  for(w = strings.begin();
      w != strings.end(); w++) {
    // Prefix each line with its line number:
    ostringstream ss;
    ss << i++;
    *w = ss.str() + ": " + *w;
  }
  // Now send them out:
  copy(strings.begin(), strings.end(),
    ostream_iterator<string>(cout, "\n"));
  // Since they aren't pointers, string
  // objects clean themselves up!
} ///:~

Once the vector<string> called strings is created, each line in the file is read into a string 
and put in the vector: 

  while(getline(in, line))
    strings.push_back(line);

The operation that's being performed on this file is to add line numbers. A stringstream 
provides easy conversion from an int to a string of characters representing that int. 

Assembling string objects is quite easy, since operator+ is overloaded. Sensibly enough, the 
iterator w can be dereferenced to produce a string that can be used as both an rvalue and an 
lvalue: 

  *w = ss.str() + ": " + *w;

The fact that you can assign back into the container via the iterator may seem a bit surprising 
at first, but it's a tribute to the careful design of the STL. 

Because the vector<string> contains the objects themselves, a number of interesting things 
take place. First, no cleanup is necessary. Even if you were to put addresses of the string 
objects as pointers into other containers, it's clear that strings is the "master list" and 
maintains ownership of the objects. 

Second, you are effectively using dynamic object creation, and yet you never use new or 
delete! That's because, somehow, it's all taken care of for you by the vector (this is non- 
trivial. You can try to figure it out by looking at the header files for the STL — all the code is 
there — but it's quite an exercise). Thus your coding is significantly cleaned up. 



The limitation of holding objects instead of pointers inside containers is quite severe: you 
can't upcast from derived types, thus you can't use polymorphism. The problem with 
upcasting objects by value is that they get sliced and converted until their type is completely 
changed into the base type, and there's no remnant of the derived type left. It's pretty safe to 
say that you never want to do this. 
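
The slicing effect is easy to see in a small experiment. This is only a sketch: the classes Base and Derived and the two helper functions are hypothetical names invented for the illustration.

```cpp
#include <string>
#include <vector>

class Base {
public:
  virtual ~Base() {}
  virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
public:
  std::string name() const { return "Derived"; }
};

// Held by value: push_back() copies only the Base part of the
// Derived object, so the virtual call finds Base::name().
std::string nameFromValueContainer() {
  std::vector<Base> v;
  v.push_back(Derived());
  return v[0].name();
}

// Held by pointer: the object keeps its real type, so the virtual
// call finds Derived::name().
std::string nameFromPointerContainer() {
  std::vector<Base*> v;
  v.push_back(new Derived);
  std::string result = v[0]->name();
  delete v[0]; // The container will not do this for you
  return result;
}
```

The first function reports "Base" even though a Derived was inserted; the second reports "Derived", at the price of having to delete the object by hand.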

Inheriting from STL containers 



Consider taking the functionality in StringVector.cpp 
and package it into a class for later reuse. 

Now the question is: do you create a member object of type vector, or do you inherit? A 
general guideline is to always prefer composition (member objects) over inheritance, but with 
the STL this is often not true, because there are so many existing algorithms that work with 
the STL types that you may want your new type to be an STL type. So the list of strings 
should also be a vector, thus inheritance is desired. 

//: C04:FileEditor.h
// File editor tool
#ifndef FILEEDITOR_H
#define FILEEDITOR_H
#include <string>
#include <vector>
#include <iostream>

class FileEditor : 
public std::vector<std::string> {
public:
  FileEditor(char* filename);
  void write(std::ostream& out = std::cout);
};
#endif // FILEEDITOR_H ///:~

Note the careful avoidance of a global using namespace std statement here, to prevent the 
opening of the std namespace to every file that includes this header. 

The constructor opens the file and reads it into the FileEditor, and write( ) puts the vector of 
string onto any ostream. Notice in write( ) that you can have a default argument for a 
reference. 



The implementation is quite simple: 

//: C04:FileEditor.cpp {O}
#include "FileEditor.h"
#include "../require.h"
#include <fstream>
using namespace std;

FileEditor::FileEditor(char* filename) {
  ifstream in(filename);
  assure(in, filename);
  string line;
  while(getline(in, line))
    push_back(line);
}

// Could also use copy() here:
void FileEditor::write(ostream& out) {
  for(iterator w = begin(); w != end(); w++)
    out << *w << endl;
} ///:~



The functions from StringVector.cpp are simply repackaged. Often this is the way classes 
evolve - you start by creating a program to solve a particular application, then discover some 
commonly-used functionality within the program that can be turned into a class. 

The line numbering program can now be rewritten using FileEditor: 

//: C04:FEditTest.cpp
//{L} FileEditor
// Test the FileEditor
#include "FileEditor.h"
#include "../require.h"
#include <sstream>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FileEditor file(argv[1]);
  // Do something to the lines...
  int i = 1;
  FileEditor::iterator w = file.begin();
  while(w != file.end()) {
    // Prefix each line with its line number:
    ostringstream ss;
    ss << i++;
    *w = ss.str() + ": " + *w;
    w++;
  }
  // Now send them to cout:
  file.write();
} ///:~
Now the operation of reading the file is in the constructor: 

  FileEditor file(argv[1]);

and writing happens in the single line (which defaults to sending the output to cout): 

  file.write();
The bulk of the program is involved with actually modifying the file in memory. 

A plethora of iterators 

An iterator is the abstraction that allows a piece of code to be generic, 
and to work with different types of containers without knowing the underlying structure of 
those containers. Every container produces iterators. You must always be able to say: 

  ContainerType::iterator

  ContainerType::const_iterator

to produce the types of the iterators produced by that container. Every container has a begin( ) 
method that produces an iterator indicating the beginning of the elements in the container, and 
an end( ) method that produces an iterator which is known as the past-the-end value of the 
container. If the container is const, begin( ) and end( ) produce const iterators. 

Every iterator can be moved forward to the next element using the operator++ (an iterator 
may be able to do more than this, as you shall see, but it must at least support forward 
movement with operator++). 

The basic iterator is only guaranteed to be able to perform == and != comparisons. Thus, to 
move an iterator it forward without running it off the end you say something like: 

  while(it != pastEnd) {
    // Do something
    it++;
  }

Where pastEnd is the past-the-end value produced by the container's end( ) member 
function. 

An iterator can be used to produce the element that it is currently selecting within a container 
by dereferencing the iterator. This can take two forms. If it is an iterator and f() is a member 
function of the objects held in the container that the iterator is pointing within, then you can 
say either: 

  (*it).f();

or 

  it->f();



Knowing this, you can create a template that works with any container. Here, the apply( ) 
function template calls a member function for every object in the container, using a pointer to 
member that is passed as an argument: 



//: C04:Apply.cpp
// Using basic iterators
#include <algorithm>
#include <iostream>
#include <vector>
#include <iterator>
using namespace std;

template<class Cont, class PtrMemFun>
void apply(Cont& c, PtrMemFun f) {
  typename Cont::iterator it = c.begin();
  while(it != c.end()) {
    // (it->*f)(); // Compact form: compiles only if the
    //              // iterator happens to be a raw pointer
    ((*it).*f)();  // Portable form
    it++;
  }
}

class Z {
  int i;
public:
  Z(int ii) : i(ii) {}
  void g() { i++; }
  friend ostream& 
  operator<<(ostream& os, const Z& z) {
    return os << z.i;
  }
};

int main() {
  ostream_iterator<Z> out(cout, " ");
  vector<Z> vz;
  for(int i = 0; i < 10; i++)
    vz.push_back(Z(i));
  copy(vz.begin(), vz.end(), out);
  cout << endl;
  apply(vz, &Z::g);
  copy(vz.begin(), vz.end(), out);
} ///:~

Although operator-> is defined for STL iterators, operator->* is not, so the compact form 
(it->*f)( ) works only when the iterator happens to be an ordinary pointer. Dereferencing the 
iterator first and then applying .* works with every container (in the following chapter you'll 
learn a more elegant way to handle the problem of applying a member function or ordinary 
function to every object in a container). 



Much of the time, this is all you need to know about iterators - that they are produced by 
begin( ) and end( ), and that you can use them to move through a container and select 
elements. Many of the problems that you solve, and the STL algorithms (covered in the next 
chapter) will allow you to just flail away with the basics of iterators. However, things can at 
times become more subtle, and in those cases you need to know more about iterators. The rest 
of this section gives you the details. 

Iterators in reversible containers 

All containers must produce the basic iterator. A container may also be reversible, which 
means that it can produce iterators that move backwards from the end, as well as the iterators 
that move forward from the beginning. 

A reversible container has the methods rbegin( ) (to produce a reverse_iterator selecting the 
end) and rend( ) (to produce a reverse_iterator indicating "one past the beginning"). If the 
container is const then rbegin( ) and rend( ) will produce const_reverse_iterators. 

All the basic sequence containers vector, deque and list are reversible containers. The 
following example uses vector, but will work with deque and list as well: 

//: C04:Reversible.cpp
// Using reversible containers
#include "../require.h"
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {
  ifstream in("Reversible.cpp");
  assure(in, "Reversible.cpp");
  string line;
  vector<string> lines;
  while(getline(in, line))
    lines.push_back(line);
  vector<string>::reverse_iterator r;
  for(r = lines.rbegin(); r != lines.rend(); r++)
    cout << *r << endl;
} ///:~

You move backward through the container using the same syntax as moving forward through 
a container with an ordinary iterator. 

The associative containers set, multiset, map and multimap are also reversible. Using 
iterators with associative containers is a bit different, however, and will be delayed until those 
containers are more fully introduced. 



Iterator categories 



Input: read-only, one pass 

The only predefined implementations of input iterators are istream_iterator and 
istreambuf_iterator, to read from an istream. As you can imagine, an input iterator can only 
be dereferenced once for each element that's selected, just as you can only read a particular 
portion of an input stream once. They can only move forward. There is a special constructor 
to define the past-the-end value. In summary, you can dereference it for reading (once only 
for each value), and move it forward. 
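
As a sketch of input-iterator usage, the following reads integers from a stream in a single forward pass. The function name readInts is hypothetical; istream_iterator and its default constructor (which produces the past-the-end value) are the standard pieces.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated ints from text in one forward pass.
std::vector<int> readInts(const std::string& text) {
  std::istringstream in(text);
  std::istream_iterator<int> first(in); // Attempts the first read
  std::istream_iterator<int> pastEnd;   // Default ctor: past-the-end
  std::vector<int> result;
  for(; first != pastEnd; ++first)
    result.push_back(*first); // Dereference once per value
  return result;
}
```

Once the loop has moved past a value there is no way to go back and read it again through the iterator, which is exactly the "one pass" restriction.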



Output: write-only, one pass 



This is the complement of an input iterator, but for writing rather than reading. The only 
predefined implementations of output iterators are ostream_iterator and 
ostreambuf_iterator, to write to an ostream, and the less-commonly-used 
raw_storage_iterator. Again, these can only be dereferenced once for each written value, 
and they can only move forward. There is no concept of a terminal past-the-end value for an 
output iterator. Summarizing, you can dereference it for writing (once only for each value) 
and move it forward. 
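
The write-then-advance protocol can be sketched by driving an ostream_iterator by hand instead of through copy( ). The function name writeInts is hypothetical, and an ostringstream stands in for cout so the result can be inspected.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Writes each value followed by a comma, using an output iterator.
std::string writeInts(const std::vector<int>& values) {
  std::ostringstream out;
  std::ostream_iterator<int> dest(out, ",");
  for(std::vector<int>::const_iterator it = values.begin();
      it != values.end(); ++it) {
    *dest = *it; // Each assignment writes one value and the delimiter
    ++dest;      // Required by the protocol, even though it does
                 // nothing visible for an ostream_iterator
  }
  return out.str();
}
```

Notice there is no test against an end value for dest: the loop is controlled entirely by the source sequence, since an output iterator has no past-the-end.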
Forward: multiple read/write 



The forward iterator contains all the functionality of both the input iterator and the output 
iterator, plus you can dereference an iterator location multiple times, so you can read and 
write to a value multiple times. As the name implies, you can only move forward. There are 
no predefined iterators that are only forward iterators. 



Bidirectional: operator-- 



The bidirectional iterator has all the functionality of the forward iterator, and in addition it can 
be moved backwards one location at a time using operator--. 



Random-access: like a pointer 



Finally, the random-access iterator has all the functionality of the bidirectional iterator plus all 
the functionality of a pointer (a pointer is a random-access iterator). Basically, anything you 
can do with a pointer you can do with a random-access iterator, including indexing with 
operator[ ], adding integral values to a pointer to move it forward or backward by a number 
of locations, and comparing one iterator to another with <, >=, etc. 
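
A few of those pointer-like operations, sketched as tiny function templates. The names elementAt, inOrder and gap are hypothetical; each one compiles only for random-access iterators (those of vector and deque, and ordinary pointers), not for the iterators of list.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Indexing relative to an iterator, just as with a pointer.
template<class RanIt>
int elementAt(RanIt first, std::ptrdiff_t n) {
  return first[n];
}

// Ordering comparison between two iterators into the same sequence.
template<class RanIt>
bool inOrder(RanIt a, RanIt b) {
  return a < b;
}

// Distance between two iterators, computed by subtraction.
template<class RanIt>
std::ptrdiff_t gap(RanIt a, RanIt b) {
  return b - a;
}
```

Handing any of these a list<int>::iterator produces a compile-time error, which is the iterator category making itself felt.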



Is this really important? 



Why do you care about this categorization? When you're just using containers in a 
straightforward way (for example, just hand-coding all the operations you want to perform on 
the objects in the container) it usually doesn't impact you too much. Things either work or 
they don't. The iterator categories become important when: 

1. You use some of the fancier built-in iterator types that will be demonstrated shortly. Or 
you graduate to creating your own iterators (this will also be demonstrated later in this 
chapter). 

2. You use the STL algorithms (the subject of the next chapter). Each of the algorithms has 
requirements that it places on the iterators that it works with. Knowledge of the 
iterator categories is even more important when you create your own reusable algorithm 
templates, because the iterator category that your algorithm requires determines how 
flexible the algorithm will be. If you only require the most primitive iterator category 
(input or output) then your algorithm will work with everything (copy( ) is an example of 
this). 
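
To make the second point concrete, here is a minimal sketch of an algorithm that asks for nothing more than an input iterator. The names countEqual( ), countInVector( ), countInList( ) and countInStream( ) are my own illustrations and are not part of the STL. Because the template only reads each position once and only moves forward, the same code should accept a random-access iterator, a bidirectional iterator and a genuine input iterator:

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <sstream>
#include <vector>

// Hypothetical helper (not an STL algorithm): counts the elements in
// [first, last) that compare equal to value. It reads each position
// once and only moves forward, so an input iterator is sufficient.
template<typename InputIt, typename T>
int countEqual(InputIt first, InputIt last, const T& value) {
  int n = 0;
  for(; first != last; ++first)
    if(*first == value) ++n;
  return n;
}

// The same template applied to three different iterator categories:
int countInVector() {
  std::vector<int> v;
  v.push_back(1); v.push_back(7); v.push_back(7);
  return countEqual(v.begin(), v.end(), 7);  // random-access iterators
}

int countInList() {
  std::list<int> l;
  l.push_back(7); l.push_back(2); l.push_back(7);
  return countEqual(l.begin(), l.end(), 7);  // bidirectional iterators
}

int countInStream() {
  std::istringstream in("7 3 7 7");
  std::istream_iterator<int> first(in), last;
  return countEqual(first, last, 7);         // input iterators
}
```

Had countEqual( ) used operator[ ] or stepped backward, it would have compiled for the vector only; keeping the requirements minimal is what makes it work everywhere.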



Predefined iterators 



The STL has a predefined set of iterator classes that can be quite handy. For example, you've 
already seen reverse_iterator (produced by calling rbegin( ) and rend( ) for all the basic 
containers). 

The insertion iterators are necessary because some of the STL algorithms (copy( ), for 
example) use the assignment operator= in order to place objects in the destination 
container. This is a problem when you're using the algorithm to fill the container rather than 
to overwrite items that are already in the destination container. That is, when the space isn't 
already there. What the insert iterators do is change the implementation of operator= so that 
instead of doing an assignment, it calls a "push" or "insert" function for that container, thus 
causing it to allocate new space. The constructors for both back_insert_iterator and 
front_insert_iterator take a basic sequence container object (vector, deque or list) as their 
argument and produce an iterator that calls push_back( ) or push_front( ), respectively, to 
perform assignment. The shorthand functions back_inserter( ) and front_inserter( ) produce 
the same objects with a little less typing. Since all the basic sequence containers support 
push_back( ), you will probably find yourself using back_inserter( ) with some regularity. 

The insert_iterator allows you to insert elements in the middle of the sequence, again 
replacing the meaning of operator=, but this time with insert( ) instead of one of the "push" 
functions. The insert( ) member function requires an iterator indicating the place to insert 
before, so the insert_iterator requires this iterator in addition to the container object. The 
shorthand function inserter( ) produces the same object. 

The following example shows the use of the different types of inserters: 





//: C04:Inserters.cpp
// Different types of iterator inserters
#include <algorithm>  // copy()
#include <iostream>
#include <vector>
#include <deque>
#include <list>
#include <iterator>
using namespace std;

int a[] = { 1, 3, 5, 7, 11, 13, 17, 19, 23 };

template<class Cont>
void frontInsertion(Cont& ci) {
  copy(a, a + sizeof(a)/sizeof(int),
    front_inserter(ci));
  copy(ci.begin(), ci.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
}

template<class Cont>
void backInsertion(Cont& ci) {
  copy(a, a + sizeof(a)/sizeof(int),
    back_inserter(ci));
  copy(ci.begin(), ci.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
}

template<class Cont>
void midInsertion(Cont& ci) {
  typename Cont::iterator it = ci.begin();
  it++; it++; it++;
  // Insert the first half of a[] before position it:
  copy(a, a + sizeof(a)/(sizeof(int) * 2),
    inserter(ci, it));
  copy(ci.begin(), ci.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
}

int main() {
  deque<int> di;
  list<int> li;
  vector<int> vi;
  // Can't use a front_inserter() with vector
  frontInsertion(di);
  frontInsertion(li);
  di.clear();
  li.clear();
  backInsertion(vi);
  backInsertion(di);
  backInsertion(li);
  midInsertion(vi);
  midInsertion(di);
  midInsertion(li);
} ///:~

Since vector does not support push_front( ), it cannot produce a front_insert_iterator. 

However, you can see that vector does support the other two types of insertion (even though, 
as you shall see later, insert( ) is not a very efficient operation for vector). 

IO stream iterators 

You've already seen some use of the ostream_iterator (an output iterator) in conjunction 
with copy( ) to place the contents of a container on an output stream. There is a corresponding 
istream_iterator (an input iterator) which allows you to "iterate" a set of objects of a 
specified type from an input stream. An important difference between ostream_iterator and 
istream_iterator comes from the fact that an output stream doesn't have any concept of an 
"end," since you can always just keep writing more elements. However, an input stream 
eventually terminates (for example, when you reach the end of a file) so there needs to be a 
way to represent that. An istream_iterator has two constructors, one that takes an istream 
and produces the iterator you actually read from, and the other which is the default 
constructor and produces an object which is the past-the-end sentinel. In the following 
program this object is named end: 

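//: C04:StreamIt.cpp
// Iterators for istreams and ostreams
#include "../require.h"
#include <algorithm>  // copy()
#include <iostream>
#include <fstream>
#include <iterator>
#include <vector>
#include <string>
using namespace std;

int main() {
  ifstream in("StreamIt.cpp");
  assure(in, "StreamIt.cpp");
  istream_iterator<string> init(in), end;
  ostream_iterator<string> out(cout, "\n");
  vector<string> vs;
  copy(init, end, back_inserter(vs));
  copy(vs.begin(), vs.end(), out);
  *out++ = vs[0];
  *out++ = "That's all, folks!";
} ///:~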


When in runs out of input (in this case when the end of the file is reached) then init becomes 
equivalent to end and the copy( ) terminates. 

Because out is an ostream_iterator<string>, you can simply assign any string object to the 
dereferenced iterator using operator= and that string will be placed on the output stream, as 
seen in the two assignments to out. Because out is defined with a newline as its second 
argument, these assignments also cause a newline to be inserted along with each assignment. 
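
If you'd like to experiment without a file or require.h, the same two iterators can be pointed at string streams. The following is only a sketch; the helper names splitWords( ) and joinLines( ) are hypothetical and exist purely for illustration. The first uses the reading iterator together with the default-constructed sentinel, and the second shows operator= on an ostream_iterator writing both the value and the delimiter:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated words from text.
std::vector<std::string> splitWords(const std::string& text) {
  std::istringstream in(text);
  // The reading iterator, and the default-constructed end sentinel:
  std::istream_iterator<std::string> init(in), end;
  return std::vector<std::string>(init, end);
}

// Hypothetical helper: writes each word followed by a newline.
std::string joinLines(const std::vector<std::string>& words) {
  std::ostringstream os;
  std::ostream_iterator<std::string> out(os, "\n");
  for(std::size_t i = 0; i < words.size(); ++i)
    *out++ = words[i];  // Assignment performs os << words[i] << "\n"
  return os.str();
}
```

Calling splitWords( ) on a string containing runs of spaces and newlines yields just the words, because istream_iterator<string> goes through operator>> and therefore skips whitespace.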

While it is possible to create an istream_iterator<char> and ostream_iterator<char>, these 
actually parse the input and thus will for example automatically eat whitespace (spaces, tabs 
and newlines), which is not desirable if you want to manipulate an exact representation of an 
istream. Instead, you can use the special iterators istreambuf_iterator and 
ostreambuf_iterator, which are designed strictly to move characters*. Although these are 
templates, the only template arguments they will accept are either char or wchar_t (for wide 
characters). The following example allows you to compare the behavior of the stream iterators 
vs. the streambuf iterators: 

* These were actually created to abstract the "locale" facets away from iostreams, so that 
locale facets could operate on any sequence of characters, not only iostreams. Locales allow 
iostreams to easily handle culturally-different formatting (such as representation of money), 
and are beyond the scope of this book. 



//: C04:StreambufIterator.cpp
// istreambuf_iterator & ostreambuf_iterator
#include "../require.h"
#include <iostream>
#include <fstream>
#include <iterator>
#include <algorithm>
using namespace std;

int main() {
  ifstream in("StreambufIterator.cpp");
  assure(in, "StreambufIterator.cpp");
  // Exact representation of stream:
  istreambuf_iterator<char> isb(in), end;
  ostreambuf_iterator<char> osb(cout);
  while(isb != end)
    *osb++ = *isb++; // Copy 'in' to cout
  cout << endl;
  ifstream in2("StreambufIterator.cpp");
  // Strips white space:
  istream_iterator<char> is(in2), end2;
  ostream_iterator<char> os(cout);
  while(is != end2)
    *os++ = *is++;
  cout << endl;
} ///:~



The stream iterators use the parsing defined by istream::operator>>, which is probably not 
what you want if you are parsing characters directly - it's fairly rare that you would want all 
the whitespace stripped out of your character stream. You'll virtually always want to use a 
streambuf iterator when using characters and streams, rather than a stream iterator. In 
addition, istream::operator>> adds significant overhead for each operation, so it is only 
appropriate for higher-level operations such as parsing floating-point numbers.* 
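
The difference is easy to see on a short string. In this sketch the function names exactCopy( ) and strippedCopy( ) are my own, and a string stream stands in for the file; the first function moves raw characters through a streambuf iterator, while the second goes through operator>>:

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copies every character, whitespace included,
// because istreambuf_iterator reads straight from the stream buffer.
std::string exactCopy(const std::string& text) {
  std::istringstream in(text);
  std::istreambuf_iterator<char> isb(in), end;
  return std::string(isb, end);
}

// Hypothetical helper: istream_iterator<char> uses operator>>, which
// skips whitespace before extracting each character.
std::string strippedCopy(const std::string& text) {
  std::istringstream in(text);
  std::istream_iterator<char> is(in), end;
  return std::string(is, end);
}
```

Given the text "a b" followed by a tab, a "c" and a newline, exactCopy( ) returns it unchanged while strippedCopy( ) returns "abc".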



* I am indebted to Nathan Myers for explaining this. 

Manipulating raw storage 



This is a little more esoteric and is generally used in the implementation of other Standard 
Library functions, but it is nonetheless interesting. The raw_storage_iterator is defined in 
<memory> and is an output iterator. It is provided to enable algorithms to store their results 
into uninitialized memory. The interface is quite simple: the constructor takes an 
iterator that is pointing to the raw memory (thus it is typically a pointer) and the operator= 
assigns an object into that raw memory. The template parameters are the type of the 
iterator pointing to the raw storage, and the type of object that will be stored. Here's an 
example which creates Noisy objects (you'll be introduced to the Noisy class shortly; it's not 
necessary to know its details for this example): 

//: C04:RawStorageIterator.cpp
#include "Noisy.h"
#include <iostream>
#include <iterator>
#include <algorithm>
#include <memory>  // raw_storage_iterator
using namespace std;

int main() {
  const int quantity = 10;
  // Create raw storage and cast to desired type:
  Noisy* np =
    (Noisy*)new char[quantity * sizeof(Noisy)];
  raw_storage_iterator<Noisy*, Noisy> rsi(np);
  for(int i = 0; i < quantity; i++)
    *rsi++ = Noisy(); // Place objects in storage
  cout << endl;
  copy(np, np + quantity,
    ostream_iterator<Noisy>(cout, " "));
  cout << endl;
  // Explicit destructor call for cleanup:
  for(int j = 0; j < quantity; j++)
    (&np[j])->~Noisy();
  // Release raw storage (it was allocated with new[]):
  delete [] (char*)np;
} ///:~

To make the raw_storage_iterator template happy, the raw storage must be of the same type 
as the objects you're creating. That's why the pointer from the new array of char is cast to a 
Noisy*. The assignment operator forces the objects into the raw storage using the copy- 
constructor. Note that the explicit destructor call must be made for proper cleanup, and this 
also allows the objects to be deleted one at a time during container manipulation. 
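
The raw_storage_iterator was deprecated in C++17 and removed in C++20, so on a newer compiler the listing above may not build. The underlying idea (construct into uninitialized memory, destroy explicitly, then release the memory) can be sketched with placement new instead. Everything named here is an assumption of mine: Tracked is a small stand-in for Noisy that merely counts live objects, and buildInRawStorage( ) is a hypothetical function for illustration:

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for Noisy: counts how many objects are alive.
struct Tracked {
  static int live;
  int id;
  explicit Tracked(int i) : id(i) { ++live; }
  Tracked(const Tracked& rv) : id(rv.id) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Builds quantity objects in raw storage, sums their ids, destroys
// the objects and releases the storage. Returns the sum of the ids.
int buildInRawStorage(int quantity) {
  assert(quantity > 0);
  // Raw, uninitialized storage: no constructors run here.
  void* raw = ::operator new(quantity * sizeof(Tracked));
  Tracked* tp = static_cast<Tracked*>(raw);
  for(int i = 0; i < quantity; ++i)
    new (tp + i) Tracked(i);  // Placement new: construct in place
  int sum = 0;
  for(int i = 0; i < quantity; ++i)
    sum += tp[i].id;
  for(int i = 0; i < quantity; ++i)
    tp[i].~Tracked();         // Explicit destructor call
  ::operator delete(raw);     // Release the raw storage
  return sum;
}
```

After a call such as buildInRawStorage(4), Tracked::live should be back to zero, which confirms that every object placed in the raw storage was also destroyed.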
Basic sequences: 

vector, list & deque 



If you take a step back from the STL containers you'll see that there are really only two types 
of container: sequences (including vector, list, deque, stack, queue, and priority_queue) 
and associations (including set, multiset, map and multimap). The sequences keep the 
objects in whatever sequence that you establish (either by pushing the objects on the end or 
inserting them in the middle). 

Since all the sequence containers have the same basic goal (to maintain your order) they seem 
relatively interchangeable. However, they differ in the efficiency of their operations, so if you 
are going to manipulate a sequence in a particular fashion you can choose the appropriate 
container for those types of manipulations. The "basic" sequence containers are vector, list 
and deque: these actually have fleshed-out implementations, while stack, queue and 
priority_queue are built on top of the basic sequences, and represent more specialized uses 
rather than differences in underlying structure (stack, for example, can be implemented using 
a deque, vector or list). 

So far in this book I have been using vector as a catch-all container. This was acceptable 
because I've only used the simplest and safest operations, primarily push_back( ) and 
operator[ ]. However, when you start making more sophisticated uses of containers it 
becomes important to know more about their underlying implementations and behavior, so 
you can make the right choices (and, as you'll see, stay out of trouble). 



Basic sequence operations 



Using a template, the following example shows the operations that all the basic sequences 
(vector, deque or list) support. As you shall learn in the sections on the specific sequence 
containers, not all of these operations make sense for each basic sequence, but they are 
supported. 



//: C04:BasicSequenceOperations.cpp
// The operations available for all the
// basic sequence Containers.
#include <iostream>
#include <vector>
#include <deque>
#include <list>
using namespace std;

template<typename Container>
void print(Container& c, const char* s = "") {
  cout << s << ":" << endl;
  if(c.empty()) {
    cout << "(empty)" << endl;
    return;
  }
  typename Container::iterator it;
  for(it = c.begin(); it != c.end(); it++)
    cout << *it << " ";
  cout << endl;
  cout << "size() " << c.size()
    << " max_size() " << c.max_size()
    << " front() " << c.front()
    << " back() " << c.back() << endl;
}

template<typename ContainerOfInt>
void basicOps(const char* s) {
  cout << "------- " << s << " -------" << endl;
  typedef ContainerOfInt Ci;
  Ci c;
  print(c, "c after default constructor");
  Ci c2(10, 1); // 10 elements, values all 1
  print(c2, "c2 after constructor(10,1)");
  int ia[] = { 1, 3, 5, 7, 9 };
  const int iasz = sizeof(ia)/sizeof(*ia);
  // Initialize with begin & end iterators:
  Ci c3(ia, ia + iasz);
  print(c3, "c3 after constructor(iter,iter)");
  Ci c4(c2); // Copy-constructor
  print(c4, "c4 after copy-constructor(c2)");
  c = c2; // Assignment operator
  print(c, "c after operator=c2");
  c.assign(10, 2); // 10 elements, values all 2
  print(c, "c after assign(10, 2)");
  // Assign with begin & end iterators:
  c.assign(ia, ia + iasz);
  print(c, "c after assign(iter, iter)");
  cout << "c using reverse iterators:" << endl;
  typename Ci::reverse_iterator rit = c.rbegin();
  while(rit != c.rend())
    cout << *rit++ << " ";
  cout << endl;
  c.resize(4);
  print(c, "c after resize(4)");
  c.push_back(47);
  print(c, "c after push_back(47)");
  c.pop_back();
  print(c, "c after pop_back()");
  typename Ci::iterator it = c.begin();
  c.insert(it, 74);
  print(c, "c after insert(it, 74)");
  it = c.begin();
  c.insert(it, 3, 96);
  print(c, "c after insert(it, 3, 96)");
  it = c.begin();
  c.insert(it, c3.begin(), c3.end());
  print(c, "c after insert("
    "it, c3.begin(), c3.end())");
  // The insert may have invalidated it, so get a new one:
  it = c.begin();
  c.erase(it);
  print(c, "c after erase(it)");
  typename Ci::iterator it2 = it = c.begin();
  it2++; it2++; it2++; it2++; it2++;
  c.erase(it, it2);
  print(c, "c after erase(it, it2)");
  c.swap(c2);
  print(c, "c after swap(c2)");
  c.clear();
  print(c, "c after clear()");
}

int main() {
  basicOps<vector<int> >("vector");
  basicOps<deque<int> >("deque");
  basicOps<list<int> >("list");
} ///:~

The first function template, print( ), demonstrates the basic information you can get from any 
sequence container: whether it's empty, its current size, the size of the largest possible 
container, the element at the beginning and the element at the end. You can also see that every 
container has begin( ) and end( ) methods that return iterators. 

The basicOps( ) function tests everything else (and in turn calls print( )), including a variety 
of constructors: default, copy-constructor, quantity and initial value, and beginning and 
ending iterators. There's an assignment operator= and two kinds of assign( ) member 
functions, one which takes a quantity and initial value and the other which takes a beginning 
and ending iterator. 

All the basic sequence containers are reversible containers, as shown by the use of the 
rbegin( ) and rend( ) member functions. A sequence container can be resized, and the entire 
contents of the container can be removed with clear( ). 

Using an iterator to indicate where you want to start inserting into any sequence container, 
you can insert( ) a single element, a number of elements that all have the same value, and a 
group of elements from another container using the beginning and ending iterators of that 
group. 

To erase( ) a single element from the middle, use an iterator; to erase( ) a range of elements, 
use a pair of iterators. Notice that since a list only supports bidirectional iterators, all the 
iterator motion must be performed with increments and decrements (if the containers were 
limited to vector and deque, which produce random-access iterators, then operator+ and 
operator- could have been used to move the iterators in big jumps). 
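
If you'd rather not write out a string of increments, the standard function std::advance( ) (declared in <iterator>) chooses the right strategy for you: a single jump for random-access iterators, repeated ++ for bidirectional ones. The following sketch shows one way to use it; eraseRange( ) and demoErase( ) are hypothetical helpers of mine, not library functions:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: erases count elements starting at position
// start. std::advance() adapts to the iterator category, so the same
// code works for vector, deque and list.
template<typename Seq>
void eraseRange(Seq& c, int start, int count) {
  assert(start >= 0 && count >= 0);
  assert(static_cast<std::size_t>(start + count) <= c.size());
  typename Seq::iterator first = c.begin();
  std::advance(first, start);
  typename Seq::iterator last = first;
  std::advance(last, count);
  c.erase(first, last);
}

// Fills a sequence with 0..9, erases positions 2 through 4, and
// returns the element that is now at position 2.
template<typename Seq>
int demoErase() {
  Seq c;
  for(int i = 0; i < 10; ++i) c.push_back(i);
  eraseRange(c, 2, 3);
  typename Seq::iterator it = c.begin();
  std::advance(it, 2);
  return *it;
}
```

For all three basic sequences, demoErase( ) should return 5, since the elements 2, 3 and 4 have been removed.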

Although both list and deque support push_front( ) and pop_front( ), vector does not, so the 
only member functions that work with all three are push_back( ) and pop_back( ). 

The naming of the member function swap( ) is a little confusing, since there's also a non- 
member swap( ) algorithm that switches two elements of a container. The member swap( ), 
however, swaps everything in one container for another (if the containers hold the same type), 
effectively swapping the containers themselves. There's also a non-member version of this 
function. 



The following sections on the sequence containers discuss the particulars of each type of 
container. 



vector 



The vector is intentionally made to look like a souped-up array, since it has array-style 
indexing but also can expand dynamically. vector is so fundamentally useful that it was 
introduced in a very primitive way early in this book, and used quite regularly in previous 
examples. This section will give a more in-depth look at vector. 



To achieve maximally-fast indexing and iteration, the vector maintains its storage as a single 
contiguous array of objects. This is a critical point to observe in understanding the behavior of 
vector. It means that indexing and iteration are lightning-fast, being basically the same as 
indexing and iterating over an array of objects. But it also means that inserting an object 
anywhere but at the end (that is, appending) is not really an acceptable operation for a vector. 
It also means that when a vector runs out of pre-allocated storage, in order to maintain its 
contiguous array it must allocate a whole new (larger) chunk of storage elsewhere and copy 
the objects to the new storage. This has a number of unpleasant side effects. 

Cost of overflowing allocated storage 

A vector starts by grabbing a block of storage, as if it's taking a guess at how many objects 
you plan to put in it. As long as you don't try to put in more objects than can be held in the 
initial block of storage, everything is very rapid and efficient (note that if you do know how 
many objects to expect, you can pre-allocate storage using reserve( )). But eventually you 
will put in one too many objects and, unbeknownst to you, the vector responds by: 

1. Allocating a new, bigger piece of storage 

2. Copying all the objects from the old storage to the new (using the copy-constructor) 

3. Destroying all the old objects (the destructor is called for each one) 

4. Releasing the old memory 

For complex objects, this copy-construction and destruction can end up being very expensive 
if you overfill your vector a lot. To see what happens when you're filling a vector, here is a 
class that prints out information about its creations, destructions, assignments and 
copy-constructions: 

//: C04:Noisy.h
#ifndef NOISY_H
#define NOISY_H
#include <iostream>

class Noisy {
  static long create, assign, copycons, destroy;
  long id;
public:
  Noisy() : id(create++) {
    std::cout << "d[" << id << "]";
  }
  Noisy(const Noisy& rv) : id(rv.id) {
    std::cout << "c[" << id << "]";
    copycons++;
  }
  Noisy& operator=(const Noisy& rv) {
    std::cout << "(" << id << ")=[" <<
      rv.id << "]";
    id = rv.id;
    assign++;
    return *this;
  }
  friend bool
  operator<(const Noisy& lv, const Noisy& rv) {
    return lv.id < rv.id;
  }
  friend bool
  operator==(const Noisy& lv, const Noisy& rv) {
    return lv.id == rv.id;
  }
  ~Noisy() {
    std::cout << "~[" << id << "]";
    destroy++;
  }
  friend std::ostream&
  operator<<(std::ostream& os, const Noisy& n) {
    return os << n.id;
  }
  friend class NoisyReport;
};

struct NoisyGen {
  Noisy operator()() { return Noisy(); }
};

// A singleton. Will automatically report the
// statistics as the program terminates:
class NoisyReport {
  static NoisyReport nr;
  NoisyReport() {} // Private constructor
public:
  ~NoisyReport() {
    std::cout << "\n-------------------\n"
      << "Noisy creations: " << Noisy::create
      << "\nCopy-Constructions: "
      << Noisy::copycons
      << "\nAssignments: " << Noisy::assign
      << "\nDestructions: " << Noisy::destroy
      << std::endl;
  }
};

// Because of these this file can only be used
// in simple test situations. Move them to a
// .cpp file for more complex programs:
long Noisy::create = 0, Noisy::assign = 0,
  Noisy::copycons = 0, Noisy::destroy = 0;
NoisyReport NoisyReport::nr;
#endif // NOISY_H ///:~

Each Noisy object has its own identifier, and there are static variables to keep track of all the 
creations, assignments (using operator=), copy-constructions and destructions. The id is 
initialized using the create counter inside the default constructor; the copy-constructor and 
assignment operator take their id values from the rvalue. Of course, with operator= the lvalue 
is already an initialized object so the old value of id is printed before it is overwritten with the 
id from the rvalue. 

In order to support certain operations like sorting and searching (which are used implicitly by 
some of the containers), Noisy must have an operator< and operator==. These simply 
compare the id values. The operator<< for ostream follows the standard form and simply 
prints the id. 

NoisyGen produces a function object (since it has an operator( )) that is used to 
automatically generate Noisy objects during testing. 

NoisyReport is a type of class called a singleton, which is a "design pattern" (these are 
covered more fully in Chapter XX). Here, the goal is to make sure there is one and only one 
NoisyReport object, because it is responsible for printing out the results at program 
termination. It has a private constructor so no one else can make a NoisyReport object, and a 
single static instance of NoisyReport called nr. The only executable statements are in the 
destructor, which is called as the program exits and the static destructors are called; this 
destructor prints out the statistics captured by the static variables in Noisy. 

The one snag to this header file is the inclusion of the definitions for the statics at the end. If 
you include this header in more than one place in your project, you'll get multiple-definition 
errors at link time. Of course, you can put the static definitions in a separate cpp file and link 
it in, but that is less convenient, and since Noisy is just intended for quick-and-dirty 
experiments the header file should be reasonable for most situations. 

Using Noisy.h, the following program will show the behaviors that occur when a vector 
overflows its currently allocated storage: 



//: C04:VectorOverflow.cpp
// Shows the copy-construction and destruction
// That occurs when a vector must reallocate
// (It maintains a linear array of elements)
#include "Noisy.h"
#include "../require.h"
#include <vector>
#include <iostream>
#include <string>
#include <cstdlib>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  int size = 1000;
  if(argc >= 2) size = atoi(argv[1]);
  vector<Noisy> vn;
  Noisy n;
  for(int i = 0; i < size; i++)
    vn.push_back(n);
  cout << "\n cleaning up \n";
} ///:~

You can either use the default value of 1000, or use your own value by putting it on the 
command-line. 

When you run this program, you'll see a single default constructor call (for n), then a lot of 
copy-constructor calls, then some destructor calls, then some more copy-constructor calls, and 
so on. When the vector runs out of space in the linear array of bytes it has allocated, it must 
(to maintain all the objects in a linear array, which is an essential part of its job) get a bigger 
piece of storage and move everything over, copying first and then destroying the old objects. 
You can imagine that if you store a lot of large and complex objects, this process could 
rapidly become prohibitive. 

There are two solutions to this problem. The nicest one requires that you know beforehand 
how many objects you're going to make. In that case you can use reserve( ) to tell the vector 
how much storage to pre-allocate, thus eliminating all the copies and destructions and making 
everything very fast (especially random access to the objects with operator[ ]). Note that the 
use of reserve( ) is different from using the vector constructor with an integral first argument; 
the latter actually creates that many elements, each one initialized from a default-constructed 
object, whereas reserve( ) only allocates storage and leaves the vector empty. 
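
Here is a small sketch of that difference, using int elements so that nothing but the vector's own bookkeeping is being measured. The function names are hypothetical. It counts reallocations indirectly, by watching for changes in capacity( ) while elements are appended:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times capacity() changes while n ints are appended;
// each change corresponds to a reallocation.
int reallocations(std::vector<int>& v, int n) {
  int count = 0;
  std::size_t cap = v.capacity();
  for(int i = 0; i < n; ++i) {
    v.push_back(i);
    if(v.capacity() != cap) {
      cap = v.capacity();
      ++count;
    }
  }
  return count;
}

int withoutReserve(int n) {
  std::vector<int> v;
  return reallocations(v, n);
}

int withReserve(int n) {
  std::vector<int> v;
  v.reserve(n);           // Storage only: size() is still 0
  assert(v.empty());
  return reallocations(v, n);
}

std::size_t sizeAfterCountConstructor(int n) {
  std::vector<int> v(n);  // Creates n elements, each with value 0
  return v.size();
}
```

With reserve( ) in place the count should be zero, while without it the vector typically reallocates a number of times on the way to 1000 elements (exactly how many depends on your library's growth policy).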

However, in the more general case you won't know how many objects you'll need. If vector 
reallocations are slowing things down, you can change sequence containers. You could use a 
list, but as you'll see, the deque allows speedy insertions at either end of the sequence, and 
never needs to copy or destroy objects as it expands its storage. The deque also allows 
random access with operator[ ], but it's not quite as fast as vector's operator[ ]. So in the 
case where you're creating all your objects in one part of the program and randomly accessing 
them in another, you may find yourself filling a deque, then creating a vector from the deque 
and using the vector for rapid indexing. Of course, you don't want to program this way 
habitually, just be aware of these issues (avoid premature optimization). 
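
One observable consequence of the deque never copying its elements as it grows is that an element stays at the same address while you push more objects onto either end (the Standard guarantees that references to existing elements remain valid in that case, although iterators do not). The sketch below checks this; both function names are my own inventions for illustration:

```cpp
#include <cassert>
#include <deque>

// Appends n elements to the back and reports whether the original
// first element is still at the same address with the same value.
bool dequeFrontStaysPut(int n) {
  assert(n > 0);
  std::deque<int> d;
  d.push_back(0);
  const int* where = &d.front();  // Address of the element holding 0
  for(int i = 1; i <= n; ++i)
    d.push_back(i);               // Grow at the back
  return where == &d.front() && *where == 0;
}

// Same check, but growing the deque at the front instead.
bool dequeElementSurvivesFrontGrowth(int n) {
  assert(n > 0);
  std::deque<int> d;
  d.push_back(42);
  const int* where = &d.back();   // Address of the element holding 42
  for(int i = 0; i < n; ++i)
    d.push_front(i);              // Grow at the front
  return where == &d.back() && *where == 42;
}
```

The same experiment with a vector would not be safe to write this way: once the vector reallocates, the saved pointer refers to storage that has been released.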

There is a darker side to vector's reallocation of memory, however. Because vector keeps its 
objects in a nice, neat array (allowing, for one thing, maximally-fast random access), the 
iterators used by vector are generally just pointers (or thin wrappers around pointers). This is 
a good thing - of all the sequence containers, these pointers allow the fastest selection and 
manipulation. However, consider 
what happens when you're holding onto an iterator (i.e. a pointer) and then you add the one 
additional object that causes the vector to reallocate storage and move it elsewhere. Your 
pointer is now pointing off into nowhere: 

//: C04:VectorCoreDump.cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;

int main() {
  vector<int> vi(10, 0);
  ostream_iterator<int> out(cout, " ");
  copy(vi.begin(), vi.end(), out);
  vector<int>::iterator i = vi.begin();
  // An iterator is not necessarily convertible to long,
  // so print the address of the element it refers to:
  const void* old = &*i;
  cout << "\n i: " << old << endl;
  *i = 47;
  copy(vi.begin(), vi.end(), out);
  // Force it to move memory (could also just add
  // enough objects):
  vi.resize(vi.capacity() + 1);
  // Now i points to wrong memory; it still refers to
  // the old location, which the vector has released:
  cout << "\n i: " << old << endl;
  cout << "vi.begin(): "
       << static_cast<const void*>(&*vi.begin()) << endl;
} ///:~

If your program is breaking mysteriously, look for places where you hold onto an iterator 
while adding more objects to a vector. You'll need to get a new iterator after adding 
elements, or use operator[ ] instead for element selections. If you combine the above 
observation with the awareness of the potential expense of adding new objects to a vector, 
you may conclude that the safest way to use one is to fill it up all at once (ideally, knowing 
first how many objects you'll need) and then just use it (without adding more objects) 
elsewhere in the program. This is the way vector has been used in the book up to this point. 
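
Both remedies mentioned above (using an index instead of an iterator, and fetching a fresh iterator after the vector grows) can be sketched in a few lines. The function names are hypothetical, and the resize( ) call is just a convenient way to force a reallocation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remembers a position as an index, grows the vector past its
// capacity, then uses the index to reach the same element.
int indexSurvivesGrowth() {
  std::vector<int> vi(10, 0);
  std::size_t pos = 3;           // An index, not an iterator
  vi[pos] = 47;
  vi.resize(vi.capacity() + 1);  // Forces a reallocation
  return vi[pos];                // Still selects the same element
}

// Re-acquires the iterator after growth instead of reusing the old one.
int freshIteratorAfterGrowth() {
  std::vector<int> vi(10, 0);
  std::vector<int>::iterator it = vi.begin();
  *it = 47;
  std::vector<int>::difference_type offset = it - vi.begin();
  vi.resize(vi.capacity() + 1);  // 'it' is now invalid
  it = vi.begin() + offset;      // Get a new iterator
  return *it;
}
```

An index is relative to wherever the storage currently lives, so it keeps working after a reallocation; an iterator is tied to the old storage and must be replaced.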

You may observe that using vector as the "basic" container in the earlier chapters of this book 
may not be the best choice in all cases. This is a fundamental issue in containers, and in data 
structures in general: the "best" choice varies according to the way the container is used. The 
reason vector has been the "best" choice up until now is that it looks a lot like an array, and 
was thus familiar and easy for you to adopt. But from now on it's also worth thinking about 
other issues when choosing containers. 

Inserting and erasing elements 
The vector is most efficient if: 

1. You reserve( ) the correct amount of storage at the beginning so the vector never has to 
reallocate. 

2. You only add and remove elements from the back end. 

It is possible to insert and erase elements from the middle of a vector using an iterator, but the 
following program demonstrates what a bad idea it is: 



//: C04:VectorInsertAndErase.cpp
// Erasing an element from a vector
#include "Noisy.h"
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>
using namespace std;

int main() {
  vector<Noisy> v;
  v.reserve(11);
  cout << "11 spaces have been reserved" << endl;
  generate_n(back_inserter(v), 10, NoisyGen());
  ostream_iterator<Noisy> out(cout, " ");
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << "Inserting an element:" << endl;
  vector<Noisy>::iterator it =
    v.begin() + v.size() / 2; // Middle
  v.insert(it, Noisy());
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << "\nErasing an element:" << endl;
  // Cannot use the previous value of it:
  it = v.begin() + v.size() / 2;
  v.erase(it);
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << endl;
} ///:~


When you run the program you'll see that the call to reserve( ) really does only allocate 
storage - no constructors are called. The generate_n( ) call is pretty busy: each call to 
NoisyGen::operator( ) results in a construction, a copy-construction (into the vector) and a 
destruction of the temporary. But when an object is inserted into the vector in the middle, it 
must shove everything down to maintain the linear array and - since there is enough space - it 
does this with the assignment operator (if the argument of reserve( ) is 10 instead of eleven 
then it would have to reallocate storage). When an object is erased from the vector, the 
assignment operator is once again used to move everything up to cover the place that is being 
erased (notice that this requires that the assignment operator properly cleans up the lvalue). 
Lastly, the object on the end of the array is destroyed. 

You can imagine how enormous the overhead can become if objects are inserted and removed 
from the middle of a vector if the number of elements is large and the objects are 
complicated. It's obviously a practice to avoid. 
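
To put a rough number on that overhead, you can count the element copies and assignments that a single insert( ) triggers. In the sketch below, Counted is a hypothetical stripped-down cousin of Noisy that counts instead of printing, and opsForInsertAt( ) is likewise my own illustration. Storage is reserved in advance so that only the shifting of elements is measured, not reallocation:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Noisy: counts copy-constructions and
// assignments. No move operations are declared, so the vector has to
// fall back on these copying operations.
struct Counted {
  static int ops;
  int value;
  explicit Counted(int v = 0) : value(v) {}
  Counted(const Counted& rv) : value(rv.value) { ++ops; }
  Counted& operator=(const Counted& rv) {
    value = rv.value;
    ++ops;
    return *this;
  }
};
int Counted::ops = 0;

// Returns the number of copies and assignments performed by one
// insert at the given position of a 1000-element vector that has
// room for one more element.
int opsForInsertAt(int position) {
  assert(position >= 0 && position <= 1000);
  std::vector<Counted> v;
  v.reserve(1001);    // Avoid reallocation during the insert
  for(int i = 0; i < 1000; ++i)
    v.push_back(Counted(i));
  Counted extra(-1);
  Counted::ops = 0;   // Count only the insert itself
  v.insert(v.begin() + position, extra);
  return Counted::ops;
}
```

Inserting at position 1000 (the end) costs only a handful of operations, while inserting at position 500 must shift the 500 elements that follow, and inserting at position 0 must shift all 1000. The exact totals vary slightly between library implementations.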



deque 



The deque (double-ended-queue, pronounced "deck") is the basic sequence container 
optimized for adding and removing elements from either end. It also allows for reasonably 
fast random access - it has an operator[ ] like vector. However, it does not have vector's 
constraint of keeping everything in a single sequential block of memory. Instead, deque uses 
multiple blocks of sequential storage (keeping track of all the blocks and their order in a 
mapping structure). For this reason the overhead for a deque to add or remove elements at 
either end is very low. In addition, it never needs to copy and destroy contained objects during 
a new storage allocation (like vector does) so it is far more efficient than vector if you are 
adding an unknown quantity of objects. This means that vector is the best choice only if you 
have a pretty good idea of how many objects you need. In addition, many of the programs 
shown earlier in this book that use vector and push_back( ) might be more efficient with a 
deque. The interface to deque is only slightly different from a vector (deque has a 
push_front( ) and pop_front( ) while vector does not, for example) so converting code from 
using vector to using deque is almost trivial. Consider StringVector.cpp, which can be 
changed to use deque by replacing the word "vector" with "deque" everywhere. The 
following program adds parallel deque operations to the vector operations in 
StringVector.cpp, and performs timing comparisons: 

//: C04:StringDeque.cpp
// Converted from StringVector.cpp
#include "../require.h"
#include <algorithm>  // copy()
#include <string>
#include <deque>
#include <vector>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <ctime>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  vector<string> vstrings;
  deque<string> dstrings;
  string line;
  // Time reading into vector:
  clock_t ticks = clock();
  while(getline(in, line))
    vstrings.push_back(line);
  ticks = clock() - ticks;
  cout << "Read into vector: " << ticks << endl;
  // Repeat for deque:
  ifstream in2(argv[1]);
  assure(in2, argv[1]);
  ticks = clock();
  while(getline(in2, line))
    dstrings.push_back(line);
  ticks = clock() - ticks;
  cout << "Read into deque: " << ticks << endl;
  // Now compare indexing:
  ticks = clock();
  for(size_t i = 0; i < vstrings.size(); i++) {
    ostringstream ss;
    ss << i;
    vstrings[i] = ss.str() + ": " + vstrings[i];
  }
  ticks = clock() - ticks;
  cout << "Indexing vector: " << ticks << endl;
  ticks = clock();
  for(size_t j = 0; j < dstrings.size(); j++) {
    ostringstream ss;
    ss << j;
    dstrings[j] = ss.str() + ": " + dstrings[j];
  }
  ticks = clock() - ticks;
  cout << "Indexing deque: " << ticks << endl;
  // Compare iteration
  ofstream tmp1("tmp1.tmp"), tmp2("tmp2.tmp");
  ticks = clock();
  copy(vstrings.begin(), vstrings.end(),
    ostream_iterator<string>(tmp1, "\n"));
  ticks = clock() - ticks;
  cout << "Iterating vector: " << ticks << endl;
  ticks = clock();
  copy(dstrings.begin(), dstrings.end(),
    ostream_iterator<string>(tmp2, "\n"));
  ticks = clock() - ticks;
  cout << "Iterating deque: " << ticks << endl;
} ///:~

Knowing now what you do about the inefficiency of adding things to vector because of 
storage reallocation, you may expect dramatic differences between the two. However, on a 
megabyte-scale text file one compiler's program produced the following (measured in 
platform/compiler specific clock ticks, not seconds): 



Read into vector: 8350
Read into deque: 7690
Indexing vector: 2360
Indexing deqeue: 2480
Iterating vector: 2470
Iterating deqeue: 2410



A different compiler and platform roughly agreed with this. It's not so dramatic, is it? This 
points out some important issues: 

1. We (programmers) are typically very bad at guessing where inefficiencies occur in our 
programs. 

2. Efficiency comes from a combination of effects; here, reading the lines in and 
converting them to strings may dominate over the cost of the vector vs. deque. 

3. The string class is probably fairly well-designed in terms of efficiency. 

Of course, this doesn't mean you shouldn't use a deque rather than a vector when you know 
that an uncertain number of objects will be pushed onto the end of the container. On the 
contrary, you should, when you're tuning for performance. But you should also be aware 
that performance issues are usually not where you think they are, and the only way to know 
for sure where your bottlenecks are is by testing. Later in this chapter there will be a more 
"pure" comparison of performance between vector, deque and list. 

Converting between sequences 

Sometimes you need the behavior or efficiency of one kind of container for one part of your 
program, and a different container's behavior or efficiency in another part of the program. For 
example, you may need the efficiency of a deque when adding objects to the container but the 
efficiency of a vector when indexing them. Each of the basic sequence containers (vector, 
deque and list) has a two-iterator constructor (indicating the beginning and ending of the 
sequence to read from when creating a new object) and an assign( ) member function to read 
into an existing container, so you can easily move objects from one sequence container to 
another. 

The following example reads objects into a deque and then converts to a vector: 

//: C04:DequeConversion.cpp
// Reading into a Deque, converting to a vector
#include "Noisy.h"
#include <deque>
#include <vector>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <cstdlib>
using namespace std;

int main(int argc, char* argv[]) {
  int size = 25;
  if(argc >= 2) size = atoi(argv[1]);
  deque<Noisy> d;
  generate_n(back_inserter(d), size, NoisyGen());
  cout << "\n Converting to a vector(1)" << endl;
  vector<Noisy> v1(d.begin(), d.end());
  vector<Noisy> v2;
  v2.reserve(d.size());
  v2.assign(d.begin(), d.end());
  cout << "\n Cleanup" << endl;
} ///:~

You can try various sizes, but you should see that it makes no difference: the objects are 
simply copy-constructed into the new vectors. What's interesting is that v1 does not cause 
multiple allocations while building the vector, no matter how many elements you use. You 
might initially think that you must follow the process used for v2 and preallocate the storage 
to prevent messy reallocations, but the constructor used for vl determines the memory need 
ahead of time so this is unnecessary. 
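
As a rough check of that claim, the sketch below uses a hypothetical Counted type (a stand-in for this illustration, not the book's Noisy class) whose copy-constructor bumps a global counter. Assuming a standard-conforming library, building a vector from a deque's iterator range should copy each element exactly once, while a vector grown by push_back( ) without reserve( ) will usually copy more because of reallocation.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the book's Noisy class: it only counts copies.
struct Counted {
  static long copies;
  Counted() {}
  Counted(const Counted&) { ++copies; }
  Counted& operator=(const Counted&) { return *this; }
};
long Counted::copies = 0;

// Copies made when a vector is built with the two-iterator constructor.
long copiesForRangeConstruct(std::size_t n) {
  std::deque<Counted> d(n);
  Counted::copies = 0;  // ignore whatever building the deque cost
  std::vector<Counted> v(d.begin(), d.end());
  return Counted::copies;
}

// Copies made when the same elements are pushed one at a time, no reserve().
long copiesForPushBack(std::size_t n) {
  std::deque<Counted> d(n);
  Counted::copies = 0;
  std::vector<Counted> v;
  for (std::deque<Counted>::iterator it = d.begin(); it != d.end(); ++it)
    v.push_back(*it);
  return Counted::copies;
}
```

Calling copiesForRangeConstruct(100) should report 100 copies; the figure from copiesForPushBack(100) depends on the library's growth policy, which is why the comparison is worth running on your own compiler.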

Cost of overflowing allocated storage 

It's illuminating to see what happens with a deque when it overflows a block of storage, in 
contrast with VectorOverflow.cpp: 

//: C04:DequeOverflow.cpp
// A deque is much more efficient than a vector
// when pushing back a lot of elements, since it
// doesn't require copying and destroying.
#include "Noisy.h"
#include <deque>
#include <cstdlib>
using namespace std;

int main(int argc, char* argv[]) {
  int size = 1000;
  if(argc >= 2) size = atoi(argv[1]);
  deque<Noisy> dn;
  Noisy n;
  for(int i = 0; i < size; i++)
    dn.push_back(n);
  cout << "\n cleaning up \n";
} ///:~

Here you will never see any destructors before the words "cleaning up" appear. Since the 
deque allocates all its storage in blocks instead of a contiguous array like vector, it never 
needs to move existing storage (thus no additional copy-constructions and destructions occur). 
It simply allocates a new block. For the same reason, the deque can just as efficiently add 
elements to the beginning of the sequence, since if it runs out of storage it (again) just 
allocates a new block for the beginning. Insertions in the middle of a deque, however, could 
be even messier than for vector (but not as costly). 
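
One way to observe this is to hold the address of an element and then grow the deque at both ends. The sketch below is only an illustration (the function name is made up for it); it relies on the Standard's guarantee that insertion at either end of a deque leaves references and pointers to existing elements valid.

```cpp
#include <deque>

// Returns true if the address of an existing element is unchanged after
// many insertions at both ends of the deque.
bool addressSurvivesEndInsertions(int extra) {
  std::deque<int> d(10, 7);
  int* held = &d[5];          // pointer to an existing element
  for (int i = 0; i < extra; ++i) {
    d.push_back(i);
    d.push_front(-i);
  }
  // The element now sits at index 5 + extra, but it has not moved in memory.
  return held == &d[5 + extra] && *held == 7;
}
```

The same experiment with a vector gives no such assurance, since any push_back( ) that triggers a reallocation moves every element.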

Because a deque never moves its existing elements, references and pointers to those elements 
stay valid when you add new things to either end of a deque, unlike vector (as shown in 
VectorCoreDump.cpp). The Standard does not extend that guarantee to iterators, so refresh 
any held iterator after inserting. It's also still possible (albeit harder) to do bad things: 

//: C04:DequeCoreDump.cpp
// How to break a program using a deque
#include <deque>
#include <iostream>
#include <iterator>
#include <algorithm>
using namespace std;

int main() {
  deque<int> di(100, 0);
  // No problem iterating from beginning to end,
  // even though it spans multiple blocks:
  copy(di.begin(), di.end(),
    ostream_iterator<int>(cout, " "));
  deque<int>::iterator i = // In the middle:
    di.begin() + di.size() / 2;
  // Walk the iterator forward as you perform
  // a lot of insertions in the middle:
  for(int j = 0; j < 1000; j++) {
    cout << j << endl;
    di.insert(i++, 1); // Eventually breaks
  }
} ///:~

Of course, there are two things here that you wouldn't normally do with a deque: first, 
elements are inserted in the middle, which deque allows but isn't designed for. Second, 
the iterator is walked forward and reused after each insertion. Inserting into the middle of a 
deque invalidates every iterator into that deque, so using i again after the first insert( ) is 
undefined behavior; the program may appear to work for a while before it fails. 

If you stick to what deque is best at - insertions and removals from either end, reasonably 
rapid traversals and fairly fast random-access using operator[ ] — you'll be in good shape. 



Checked random-access 



Both vector and deque provide two ways to perform random access of their elements: the 
operator[ ], which you've seen already, and at( ), which checks the boundaries of the 
container that's being indexed and throws an exception if you go out of bounds. It does cost 
more to use at(): 

//: C04:IndexingVsAt.cpp
// Comparing "at()" to operator[]
#include "../require.h"
#include <vector>
#include <deque>
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

int main(int argc, char* argv[]) {
  requireMinArgs(argc, 1);
  long count = 1000;
  int sz = 1000;
  if(argc >= 2) count = atoi(argv[1]);
  if(argc >= 3) sz = atoi(argv[2]);
  vector<int> vi(sz);
  clock_t ticks = clock();
  for(int i1 = 0; i1 < count; i1++)
    for(int j = 0; j < sz; j++)
      vi[j];
  cout << "vector[]" << clock() - ticks << endl;
  ticks = clock();
  for(int i2 = 0; i2 < count; i2++)
    for(int j = 0; j < sz; j++)
      vi.at(j);
  cout << "vector::at " << clock() - ticks << endl;
  deque<int> di(sz);
  ticks = clock();
  for(int i3 = 0; i3 < count; i3++)
    for(int j = 0; j < sz; j++)
      di[j];
  cout << "deque[]" << clock() - ticks << endl;
  ticks = clock();
  for(int i4 = 0; i4 < count; i4++)
    for(int j = 0; j < sz; j++)
      di.at(j);
  cout << "deque::at " << clock() - ticks << endl;
  // Demonstrate at() when you go out of bounds:
  di.at(vi.size() + 1);
} ///:~


As you'll learn in the exception-handling chapter, different systems may handle the uncaught 
exception in different ways, but you'll know one way or another that something went wrong 
with the program when using at( ), whereas it's possible to go blundering ahead using 
operator[ ]. 
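
If you would rather handle the failure than let the exception go uncaught, at( ) can be wrapped in a try block. The helper below is a minimal sketch (atThrows is a name invented for it); it assumes only that at( ) reports a bad index by throwing std::out_of_range, which is what the Standard specifies for both vector and deque.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <vector>

// Returns true if c.at(index) throws std::out_of_range.
template<class Cont>
bool atThrows(const Cont& c, std::size_t index) {
  try {
    c.at(index);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

For a ten-element container, atThrows(c, 9) is false and atThrows(c, 10) is true. The same indices passed to operator[ ] would give no such signal for the second case.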



list 



A list is implemented as a doubly-linked list and is thus designed for rapid insertion and 
removal of elements in the middle of the sequence (whereas for vector and deque this is a 
much more costly operation). A list is so slow when randomly accessing elements that it does 
not have an operator[ ]. It's best used when you're traversing a sequence, in order, from 
beginning to end (or end to beginning) rather than choosing elements randomly from the 
middle. Even then the traversal is significantly slower than either a vector or a deque, but if 
you aren't doing a lot of traversals that won't be your bottleneck. 

Another thing to be aware of with a list is the memory overhead of each link, which requires a 
forward and backward pointer on top of the storage for the actual object. Thus a list is a better 
choice when you have larger objects that you'll be inserting and removing from the middle of 
the list. It's better not to use a list if you think you might be traversing it a lot, looking for 
objects, since the amount of time it takes to get from the beginning of the list - which is the 
only place you can start unless you've already got an iterator to somewhere you know is 
closer to your destination - to the object of interest is proportional to the number of objects 
between the beginning and that object. 

The objects in a list never move after they are created; "moving" a list element means 
changing the links, but never copying or assigning the actual objects. This means that a held 
iterator never moves when you add new things to a list as it was demonstrated to do in vector. 
Here's an example using the Noisy class: 

//: C04:ListStability.cpp
// Things don't move around in lists
#include "Noisy.h"
#include <list>
#include <iostream>
#include <iterator>
#include <algorithm>
using namespace std;

int main() {
  list<Noisy> l;
  ostream_iterator<Noisy> out(cout, " ");
  generate_n(back_inserter(l), 25, NoisyGen());
  cout << "\n Printing the list:" << endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Reversing the list:" << endl;
  l.reverse();
  copy(l.begin(), l.end(), out);
  cout << "\n Sorting the list:" << endl;
  l.sort();
  copy(l.begin(), l.end(), out);
  cout << "\n Swapping two elements:" << endl;
  list<Noisy>::iterator it1, it2;
  it1 = it2 = l.begin();
  it2++;
  swap(*it1, *it2);
  cout << endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Using generic reverse(): " << endl;
  reverse(l.begin(), l.end());
  cout << endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Cleanup" << endl;
} ///:~

Operations as seemingly radical as reversing and sorting the list require no copying of objects, 
because instead of moving the objects, the links are simply changed. However, notice that 
sort( ) and reverse( ) are member functions of list, so they have special knowledge of the 
internals of list and can perform the pointer movement instead of copying. On the other hand, 
the swap( ) function is a generic algorithm, and doesn't know about list in particular and so it 
uses the copying approach for swapping two elements. There are also generic algorithms for 
sort( ) and reverse( ), but if you try to use these you'll discover that the generic reverse( ) 
performs lots of copying and destruction (so you should never use it with a list) and the 
generic sort( ) simply doesn't work because it requires random-access iterators that list 
doesn't provide (a definite benefit, since this would certainly be an expensive way to sort 
compared to list's own sort( )). The generic sort( ) and reverse( ) should only be used with 
arrays, vectors and deques. 
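
The relinking behavior can be checked without the Noisy class by recording element addresses. This is a small sketch with invented helper names; it assumes the Standard's rule that list's member sort( ) and reverse( ) do not invalidate iterators or references.

```cpp
#include <list>
#include <set>

// Collects the address of every element currently in the list.
std::set<const int*> addressesOf(const std::list<int>& l) {
  std::set<const int*> result;
  for (std::list<int>::const_iterator it = l.begin(); it != l.end(); ++it)
    result.insert(&*it);
  return result;
}

// True if member sort() and reverse() left every element at its old address
// and a held iterator still refers to the same value.
bool sortAndReverseKeepAddresses() {
  int raw[] = { 5, 3, 9, 1, 7 };
  std::list<int> l(raw, raw + 5);
  std::list<int>::iterator held = l.begin();  // refers to the 5
  std::set<const int*> before = addressesOf(l);
  l.sort();
  l.reverse();
  return before == addressesOf(l) && *held == 5;
}
```

The held iterator still names the element holding 5 even though that element is no longer first, which is exactly what "changing the links" implies.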






If you have large and complex objects you may want to choose a list first, especially if 
construction, destruction, copy-construction and assignment are expensive and if you are 
doing things like sorting the objects or otherwise reordering them a lot. 



Special list operations 



The list has some special operations that are built-in to make the best use of the structure of 
the list. You've already seen reverse( ) and sort( ), and here are some of the others in use: 

//: C04:ListSpecialFunctions.cpp
#include "Noisy.h"
#include <list>
#include <iostream>
#include <iterator>
#include <algorithm>
using namespace std;
ostream_iterator<Noisy> out(cout, " ");

void print(list<Noisy>& ln, const char* comment = "") {
  cout << "\n" << comment << ":\n";
  copy(ln.begin(), ln.end(), out);
  cout << endl;
}

int main() {
  typedef list<Noisy> LN;
  LN l1, l2, l3, l4;
  generate_n(back_inserter(l1), 6, NoisyGen());
  generate_n(back_inserter(l2), 6, NoisyGen());
  generate_n(back_inserter(l3), 6, NoisyGen());
  generate_n(back_inserter(l4), 6, NoisyGen());
  print(l1, "l1"); print(l2, "l2");
  print(l3, "l3"); print(l4, "l4");
  LN::iterator it1 = l1.begin();
  it1++; it1++; it1++;
  l1.splice(it1, l2);
  print(l1, "l1 after splice(it1, l2)");
  print(l2, "l2 after splice(it1, l2)");
  LN::iterator it2 = l3.begin();
  it2++; it2++; it2++;
  l1.splice(it1, l3, it2);
  print(l1, "l1 after splice(it1, l3, it2)");
  LN::iterator it3 = l4.begin(), it4 = l4.end();
  it3++; it4--;
  l1.splice(it1, l4, it3, it4);
  print(l1, "l1 after splice(it1,l4,it3,it4)");
  Noisy n;
  LN l5(3, n);
  generate_n(back_inserter(l5), 4, NoisyGen());
  l5.push_back(n);
  print(l5, "l5 before remove()");
  l5.remove(l5.front());
  print(l5, "l5 after remove()");
  l1.sort(); l5.sort();
  l5.merge(l1);
  print(l5, "l5 after l5.merge(l1)");
  cout << "\n Cleanup" << endl;
} ///:~


The print( ) function is used to display results. After filling four lists with Noisy objects, one 
list is spliced into another in three different ways. In the first, the entire list l2 is spliced into l1 
at the iterator it1. Notice that after the splice, l2 is empty: splicing means removing the 
elements from the source list. The second splice moves the single element of l3 that it2 refers 
to into l1 at it1. The third splice inserts at it1 the elements from l4 starting at it3 and ending 
just before it4 (the seemingly-redundant mention of the source list is because the elements must be 
erased from the source list as part of the transfer to the destination list). 

The output from the code that demonstrates remove( ) shows that the list does not have to be 
sorted in order for all the elements of a particular value to be removed. 

Finally, if you merge( ) one list with another, the merge only works sensibly if the lists have 
been sorted. What you end up with in that case is a sorted list containing all the elements from 
both lists (the source list is erased; that is, the elements are moved to the destination list). 
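
The effect of splice( ) and merge( ) is easier to see with plain integers. The two helpers below are a sketch written for this purpose (their names are not from the listing above); each returns the destination's contents in a vector so the result can be inspected.

```cpp
#include <list>
#include <vector>

// Splices all of src into dst before dst's third element and returns the
// resulting contents of dst. dst must hold at least two elements.
std::vector<int> spliceWhole(std::list<int>& dst, std::list<int>& src) {
  std::list<int>::iterator pos = dst.begin();
  ++pos; ++pos;
  dst.splice(pos, src);  // src is left empty
  return std::vector<int>(dst.begin(), dst.end());
}

// Sorts both lists, merges b into a, and returns the contents of a.
std::vector<int> sortAndMerge(std::list<int>& a, std::list<int>& b) {
  a.sort();
  b.sort();
  a.merge(b);  // b is left empty
  return std::vector<int>(a.begin(), a.end());
}
```

Splicing { 10, 20 } into { 1, 2, 3, 4 } this way yields 1 2 10 20 3 4, and merging { 5, 1, 3 } with { 4, 2 } after sorting yields 1 2 3 4 5. In both cases the source list ends up empty.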

There's also a unique( ) member function that removes all duplicates, but only if the list has 
been sorted first: 

//: C04:UniqueList.cpp
// Testing list's unique() function
#include <list>
#include <iostream>
#include <iterator>
#include <algorithm>
using namespace std;

int a[] = { 1, 3, 1, 4, 1, 5, 1, 6, 1 };
const int asz = sizeof a / sizeof *a;

int main() {
  // For output:
  ostream_iterator<int> out(cout, " ");
  list<int> li(a, a + asz);
  li.unique();
  // Oops! No duplicates removed:
  copy(li.begin(), li.end(), out);
  cout << endl;
  li.sort();
  copy(li.begin(), li.end(), out);
  cout << endl;
  // Now unique() will have an effect:
  li.unique();
  copy(li.begin(), li.end(), out);
  cout << endl;
} ///:~

The list constructor used here takes the starting and past-the-end iterator from another 
container, and it copies all the elements from that container into itself (a similar constructor is 
available for all the containers). Here, the "container" is just an array, and the "iterators" are 
pointers into that array, but because of the design of the STL it works with arrays just as 
easily as any other container. 

If you run this program, you'll see that unique( ) will only remove adjacent duplicate 
elements, and thus sorting is necessary before calling unique( ). 

There are four additional list member functions that are not demonstrated here: a remove_if( ) 
that takes a predicate which is used to decide whether an object should be removed, a 
unique( ) that takes a binary predicate to perform uniqueness comparisons, a merge( ) that 
takes an additional argument which performs comparisons, and a sort( ) that takes a 
comparator (to provide a comparison or override the existing one). 
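
Since those four overloads are only described, here is a brief sketch of how they might be called. The predicate and function names (isNegative, sameParity and so on) are made up for the example; the list member functions themselves are the standard ones.

```cpp
#include <functional>
#include <list>

// Predicates used below (hypothetical names, chosen for this sketch).
bool isNegative(int x) { return x < 0; }
bool sameParity(int a, int b) { return (a % 2 == 0) == (b % 2 == 0); }

// remove_if() with a predicate, then sort() with a comparator.
void dropNegativesAndSortDescending(std::list<int>& l) {
  l.remove_if(isNegative);
  l.sort(std::greater<int>());
}

// unique() with a binary predicate: each run of adjacent values with the
// same parity is collapsed to its first element.
void collapseParityRuns(std::list<int>& l) {
  l.unique(sameParity);
}

// merge() with a comparator: both lists must already be sorted descending.
void mergeDescending(std::list<int>& a, std::list<int>& b) {
  a.merge(b, std::greater<int>());
}
```

Note that merge( ) must be given the same ordering the lists were sorted with, just as the one-argument version assumes operator<.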



list vs. set 



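Looking at the previous example you may note that if you want a sorted list with no 
duplicates, a set can give you that, right? It's interesting to compare the performance of the 
two containers: 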



//: C04:ListVsSet.cpp
// Comparing list and set performance
#include <iostream>
#include <list>
#include <set>
#include <iterator>
#include <algorithm>
#include <ctime>
#include <cstdlib>
using namespace std;

class Obj {
  int a[20]; // To take up extra space
  int val;
public:
  Obj() : val(rand() % 500) {}
  friend bool
  operator<(const Obj& a, const Obj& b) {
    return a.val < b.val;
  }
  friend bool
  operator==(const Obj& a, const Obj& b) {
    return a.val == b.val;
  }
  friend ostream&
  operator<<(ostream& os, const Obj& a) {
    return os << a.val;
  }
};

template<class Container>
void print(Container& c) {
  typename Container::iterator it;
  for(it = c.begin(); it != c.end(); it++)
    cout << *it << " ";
  cout << endl;
}

struct ObjGen {
  Obj operator()() { return Obj(); }
};

int main() {
  const int sz = 5000;
  srand(time(0));
  list<Obj> lo;
  clock_t ticks = clock();
  generate_n(back_inserter(lo), sz, ObjGen());
  lo.sort();
  lo.unique();
  cout << "list:" << clock() - ticks << endl;
  set<Obj> so;
  ticks = clock();
  generate_n(inserter(so, so.begin()),
    sz, ObjGen());
  cout << "set:" << clock() - ticks << endl;
  print(lo);
  print(so);
} ///:~

When you run the program, you should discover that set is much faster than list. This is 
reassuring; after all, it is set's primary job description! 
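
Whatever the timings turn out to be on your system, the two approaches should agree on the result. The sketch below (function names invented here, and using int rather than Obj) shows the two routes side by side so the equivalence can be checked separately from the speed.

```cpp
#include <list>
#include <set>
#include <vector>

// Sorted, duplicate-free values obtained the list way: sort() then unique().
std::vector<int> distinctViaList(const std::vector<int>& in) {
  std::list<int> l(in.begin(), in.end());
  l.sort();
  l.unique();
  return std::vector<int>(l.begin(), l.end());
}

// The same result obtained by letting a set do the work on insertion.
std::vector<int> distinctViaSet(const std::vector<int>& in) {
  std::set<int> s(in.begin(), in.end());
  return std::vector<int>(s.begin(), s.end());
}
```

For the input 1 3 1 4 1 5 1 6 1 both functions return 1 3 4 5 6.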

Swapping all basic sequences 

It turns out that all basic sequences have a member function swap( ) that's designed to switch 
one sequence with another (however, this swap( ) is only defined for sequences of the same 
type). The member swap( ) makes use of its knowledge of the internal structure of the 
particular container in order to be efficient: 



//: C04:Swapping.cpp
// All basic sequence containers can be swapped
#include "Noisy.h"
#include <list>
#include <vector>
#include <deque>
#include <iostream>
#include <iterator>
#include <algorithm>
using namespace std;
ostream_iterator<Noisy> out(cout, " ");

template<class Cont>
void print(Cont& c, const char* comment = "") {
  cout << "\n" << comment << ": ";
  copy(c.begin(), c.end(), out);
  cout << endl;
}

template<class Cont>
void testSwap(const char* cname) {
  Cont c1, c2;
  generate_n(back_inserter(c1), 10, NoisyGen());
  generate_n(back_inserter(c2), 5, NoisyGen());
  cout << "\n" << cname << ":" << endl;
  print(c1, "c1"); print(c2, "c2");
  cout << "\n Swapping the " << cname
    << ":" << endl;
  c1.swap(c2);
  print(c1, "c1"); print(c2, "c2");
}

int main() {
  testSwap<vector<Noisy> >("vector");
  testSwap<deque<Noisy> >("deque");
  testSwap<list<Noisy> >("list");
} ///:~

When you run this, you'll discover that each type of sequence container is able to swap one 
sequence for another without any copying or assignments, even if the sequences are of 
different sizes. In effect, you're completely swapping the memory of one object for another. 

The STL algorithms also contain a swap( ), and when this function is applied to two 
containers of the same type, it will use the member swap( ) to achieve fast performance. 
Consequently, if you apply the sort( ) algorithm to a container of containers, you will find 
that the performance is very fast; it turns out that fast sorting of a container of containers was 
a design goal of the STL. 
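
For vector, "swapping the memory" can be observed directly: after the swap, each vector owns the buffer the other one had. This is a minimal sketch with an invented function name; it relies on the Standard's rule that swapping two vectors does not invalidate pointers to their elements.

```cpp
#include <vector>

// True if swapping two vectors exchanged their buffers rather than copying
// elements: each old element address now belongs to the other vector.
bool swapExchangesStorage() {
  std::vector<int> a(10, 1), b(5, 2);
  const int* aData = &a[0];
  const int* bData = &b[0];
  a.swap(b);
  return &a[0] == bData && &b[0] == aData
      && a.size() == 5 && b.size() == 10;
}
```

Because only a few internal pointers change hands, the cost of the swap does not depend on how many elements either container holds.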

Robustness of lists 

To break a list, you have to work pretty hard: 

//: C04:ListRobustness.cpp
// lists are harder to break
#include <list>
#include <iostream>
using namespace std;

int main() {
  list<int> li(100, 0);
  list<int>::iterator i = li.begin();
  for(int j = 0; j < li.size() / 2; j++)
    i++;
  // Walk the iterator forward as you perform
  // a lot of insertions in the middle:
  for(int k = 0; k < 1000; k++)
    li.insert(i++, 1); // No problem
  li.erase(i);
  i++;
  *i = 2; // Oops! It's invalid
} ///:~

When the link that the iterator i was pointing to was erased, it was unlinked from the list and 
thus became invalid. Trying to move forward to the "next link" from an invalid link is poorly 
formed code (formally, undefined behavior). Notice that the operation that broke deque in 
DequeCoreDump.cpp is perfectly fine with a list. 
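
The usual way to stay out of this trouble is to continue from the iterator that erase( ) returns instead of touching the erased one. The function below is a small sketch of that idiom (eraseEvens is a name chosen for the example).

```cpp
#include <list>

// Erases every even value, using the iterator that erase() returns so the
// loop never touches an iterator to an unlinked node.
void eraseEvens(std::list<int>& l) {
  std::list<int>::iterator it = l.begin();
  while (it != l.end()) {
    if (*it % 2 == 0)
      it = l.erase(it);  // returns the element after the erased one
    else
      ++it;
  }
}
```

Applied to a list holding 1 2 3 4 5 6, this leaves 1 3 5, and at no point is an invalid iterator incremented or dereferenced.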

Performance comparison 



The following program compares the performance of the basic sequence containers for 
various operations: 

//: C04:SequencePerformance.cpp
// Comparing the performance of the basic
// sequence containers for various operations
#include <vector>
#include <deque>
#include <list>
#include <iostream>
#include <string>
#include <typeinfo>
#include <algorithm>
#include <ctime>
#include <cstdlib>
using namespace std;

class FixedSize {
  int x[20];
  // Automatic generation of default constructor,
  // copy-constructor and operator=
} fs;

template<class Cont>
struct InsertBack {
  void operator()(Cont& c, long count) {
    for(long i = 0; i < count; i++)
      c.push_back(fs);
  }
  const char* testName() { return "InsertBack"; }
};

template<class Cont>
struct InsertFront {
  void operator()(Cont& c, long count) {
    long cnt = count * 10;
    for(long i = 0; i < cnt; i++)
      c.push_front(fs);
  }
  const char* testName() { return "InsertFront"; }
};

template<class Cont>
struct InsertMiddle {
  void operator()(Cont& c, long count) {
    typename Cont::iterator it;
    long cnt = count / 10;
    for(long i = 0; i < cnt; i++) {
      // Must get the iterator every time to keep
      // from causing an access violation with
      // vector. Increment it to put it in the
      // middle of the container:
      it = c.begin();
      it++;
      c.insert(it, fs);
    }
  }
  const char* testName() { return "InsertMiddle"; }
};

template<class Cont>
struct RandomAccess { // Not for list
  void operator()(Cont& c, long count) {
    int sz = c.size();
    long cnt = count * 100;
    for(long i = 0; i < cnt; i++)
      c[rand() % sz];
  }
  const char* testName() { return "RandomAccess"; }
};

template<class Cont>
struct Traversal {
  void operator()(Cont& c, long count) {
    long cnt = count / 100;
    for(long i = 0; i < cnt; i++) {
      typename Cont::iterator it = c.begin(),
        end = c.end();
      while(it != end) it++;
    }
  }
  const char* testName() { return "Traversal"; }
};

template<class Cont>
struct Swap {
  void operator()(Cont& c, long count) {
    int middle = c.size() / 2;
    typename Cont::iterator it = c.begin(),
      mid = c.begin();
    it++; // Put it in the middle
    for(int x = 0; x < middle + 1; x++)
      mid++;
    long cnt = count * 10;
    for(long i = 0; i < cnt; i++)
      swap(*it, *mid);
  }
  const char* testName() { return "Swap"; }
};

template<class Cont>
struct RemoveMiddle {
  void operator()(Cont& c, long count) {
    long cnt = count / 10;
    if(cnt > c.size()) {
      cout << "RemoveMiddle: not enough elements"
        << endl;
      return;
    }
    for(long i = 0; i < cnt; i++) {
      typename Cont::iterator it = c.begin();
      it++;
      c.erase(it);
    }
  }
  const char* testName() { return "RemoveMiddle"; }
};

template<class Cont>
struct RemoveBack {
  void operator()(Cont& c, long count) {
    long cnt = count * 10;
    if(cnt > c.size()) {
      cout << "RemoveBack: not enough elements"
        << endl;
      return;
    }
    for(long i = 0; i < cnt; i++)
      c.pop_back();
  }
  const char* testName() { return "RemoveBack"; }
};

template<class Op, class Container>
void measureTime(Op f, Container& c, long count) {
  string id(typeid(f).name());
  bool Deque = id.find("deque") != string::npos;
  bool List = id.find("list") != string::npos;
  bool Vector = id.find("vector") != string::npos;
  string cont = Deque ? "deque" : List ? "list"
    : Vector ? "vector" : "unknown";
  cout << f.testName() << " for " << cont << ": ";
  // Standard C library CPU ticks:
  clock_t ticks = clock();
  f(c, count); // Run the test
  ticks = clock() - ticks;
  cout << ticks << endl;
}

typedef deque<FixedSize> DF;
typedef list<FixedSize> LF;
typedef vector<FixedSize> VF;

int main(int argc, char* argv[]) {
  srand(time(0));
  long count = 1000;
  if(argc >= 2) count = atoi(argv[1]);
  DF deq;
  LF lst;
  VF vec, vecres;
  vecres.reserve(count); // Preallocate storage
  measureTime(InsertBack<VF>(), vec, count);
  measureTime(InsertBack<VF>(), vecres, count);
  measureTime(InsertBack<DF>(), deq, count);
  measureTime(InsertBack<LF>(), lst, count);
  // Can't push_front() with a vector:
//! measureTime(InsertFront<VF>(), vec, count);
  measureTime(InsertFront<DF>(), deq, count);
  measureTime(InsertFront<LF>(), lst, count);
  measureTime(InsertMiddle<VF>(), vec, count);
  measureTime(InsertMiddle<DF>(), deq, count);
  measureTime(InsertMiddle<LF>(), lst, count);
  measureTime(RandomAccess<VF>(), vec, count);
  measureTime(RandomAccess<DF>(), deq, count);
  // Can't operator[] with a list:
//! measureTime(RandomAccess<LF>(), lst, count);
  measureTime(Traversal<VF>(), vec, count);
  measureTime(Traversal<DF>(), deq, count);
  measureTime(Traversal<LF>(), lst, count);
  measureTime(Swap<VF>(), vec, count);
  measureTime(Swap<DF>(), deq, count);
  measureTime(Swap<LF>(), lst, count);
  measureTime(RemoveMiddle<VF>(), vec, count);
  measureTime(RemoveMiddle<DF>(), deq, count);
  measureTime(RemoveMiddle<LF>(), lst, count);
  vec.resize(vec.size() * 10); // Make it bigger
  measureTime(RemoveBack<VF>(), vec, count);
  measureTime(RemoveBack<DF>(), deq, count);
  measureTime(RemoveBack<LF>(), lst, count);
} ///:~



This example makes heavy use of templates to eliminate redundancy, save space, guarantee 
identical code and improve clarity. Each test is represented by a class that is templatized on 
the container it will operate on. The test itself is inside the operator( ) which, in each case, 
takes a reference to the container and a repeat count — this count is not always used exactly as 
it is, but sometimes increased or decreased to prevent the test from being too short or too long. 
The repeat count is just a factor, and all tests are compared using the same value. 

Each test class also has a member function that returns its name, so that it can easily be 
printed. You might think that this should be accomplished using run-time type identification, 
but since the actual name of the class involves a template expansion, this turns out to be the 
more direct approach. 

The measureTime( ) function template takes as its first template argument the operation that 
it's going to test, which is itself a class template selected from the group defined previously 
in the listing. The template argument Op will not only contain the name of the class, but also 
(decorated into it) the type of the container it's working with. The RTTI typeid( ) operation 
allows the name of the class to be extracted as a char*, which can then be used to create a 
string called id. This string can be searched using string::find( ) to look for deque, list or 
vector. The bool variable that corresponds to the matching string becomes true, and this is 
used to properly initialize the string cont so the container name can be accurately printed, 
along with the test name. (The text returned by typeid( ).name( ) is implementation-defined, 
so on some compilers the search may match none of these and "unknown" is printed.) 

Once the type of test and the container being tested has been printed out, the actual test is 
quite simple. The Standard C library function clock( ) is used to capture the starting and 
ending CPU ticks (this is typically more fine-grained than trying to measure seconds). Since f 
is an object of type Op, which is a class that has an operator( ), the line: 

f(c, count); 

is actually calling the operator( ) for the object f. 

In main(), you can see that each different type of test is run on each type of container, except 
for the containers that don't support the particular operation being tested (these are 
commented out). 

When you run the program, you'll get comparative performance numbers for your particular 
compiler and your particular operating system and platform. Although this is only intended to 
give you a feel for the various performance features relative to the other sequences, it is not a 
bad way to get a quick-and-dirty idea of the behavior of your library, and also to compare one 
library with another. 
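
The shape of this harness can be reduced to a few lines. The following is only a sketch of the idea, not the listing above: PushBackTest and ticksFor are names invented here, the containers are assumed to hold int, and the elapsed ticks are returned rather than printed.

```cpp
#include <ctime>
#include <deque>
#include <list>
#include <vector>

// A test in the style described above: the work lives in operator().
template<class Cont>
struct PushBackTest {
  void operator()(Cont& c, long count) const {
    for (long i = 0; i < count; ++i)
      c.push_back(static_cast<int>(i));
  }
  const char* testName() const { return "PushBackTest"; }
};

// Runs f(c, count) and returns the elapsed CPU ticks reported by clock().
template<class Op, class Container>
std::clock_t ticksFor(Op f, Container& c, long count) {
  std::clock_t start = std::clock();
  f(c, count);  // calls Op::operator()
  return std::clock() - start;
}
```

A call such as ticksFor(PushBackTest<std::deque<int> >(), d, 1000) runs the test on d. Because the test is a template parameter, the same measuring code serves every container and every operation.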



set 



The set produces a container that will accept only one of each thing you place in it; it also 
sorts the elements (sorting isn't intrinsic to the conceptual definition of a set, but the STL set 
stores its elements in a balanced binary tree to provide rapid lookups, thus producing sorted 
results when you traverse it). The first two examples in this chapter used sets. 

Consider the problem of creating an index for a book. You might like to start with all the 
words in the book, but you only want one instance of each word and you want them sorted. Of 
course, a set is perfect for this, and solves the problem effortlessly. However, there's also the 
problem of punctuation and any other non-alpha characters, which must be stripped off to 
generate proper words. One solution to this problem is to use the Standard C library function 
strtok( ), which produces tokens (in our case, words) given a set of delimiters to strip out: 

//: C04:WordList.cpp
// Display a list of words used in a document
#include "../require.h"
#include <string>
#include <cstring>
#include <set>
#include <iostream>
#include <fstream>
#include <iterator>
#include <algorithm>
using namespace std;

const char* delimiters =
  " \t;()\"<>:{}[]+-=&*#.,/\\~";

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  set<string> wordlist;
  string line;
  while(getline(in, line)) {
    // Capture individual words:
    char* s = // Cast probably won't crash:
      strtok((char*)line.c_str(), delimiters);
    while(s) {
      // Automatic type conversion:
      wordlist.insert(s);
      s = strtok(0, delimiters);
    }
  }
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
    ostream_iterator<string>(cout, "\n"));
} ///:~

strtok( ) takes the starting address of a character buffer (the first argument) and looks for 
delimiters (the second argument). It replaces the delimiter with a zero, and returns the address 
of the beginning of the token. If you call it subsequent times with a first argument of zero it 
will continue extracting tokens from the rest of the string until it finds the end. In this case, 
the delimiters are those that delimit the keywords and identifiers of C++, so it extracts these 
keywords and identifiers. Each word is turned into a string and placed into the wordlist 
set, which eventually contains every distinct word in the file. 
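
The calling pattern just described can be isolated in a small function. This sketch (tokenize is a name made up for it) differs from the listing in one deliberate way: it hands strtok( ) a private, writable copy of the text, so no cast is needed and the caller's string is left untouched.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits line into tokens with strtok(), working on a private, writable,
// null-terminated copy so the original string is never modified.
std::vector<std::string> tokenize(const std::string& line,
                                  const char* delims) {
  std::vector<char> buf(line.begin(), line.end());
  buf.push_back('\0');
  std::vector<std::string> tokens;
  // First call passes the buffer; later calls pass zero to continue.
  for (char* s = std::strtok(&buf[0], delims); s != 0;
       s = std::strtok(0, delims))
    tokens.push_back(s);
  return tokens;
}
```

With the delimiters " =+;" the text "int x = a+b;" is split into int, x, a and b. The function still inherits strtok( )'s hidden static state, so it is no safer than the original in a multithreaded program.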

You don't have to use a set just to get a sorted sequence. You can use the sort( ) function 
(along with a multitude of other functions in the STL) on different STL containers. However, 
it's likely that set will be faster. 



Eliminating strtok( ) 



Some programmers consider strtok( ) to be the poorest design in the Standard C library 
because it uses a static buffer to hold its data between function calls. This means: 

1. You can't use strtok( ) in two places at the same time 

2. You can't use strtok( ) in a multithreaded program 

3. You can't use strtok( ) in a library that might be used in a multithreaded 
program 

4. strtok( ) modifies the input sequence, which can produce unexpected side 
effects 

5. strtok( ) depends on reading in "lines", which means you need a buffer big 
enough for the longest line. This produces both wastefully-sized buffers, 
and lines longer than the "longest" line. This can also introduce security 
holes. (Notice that the buffer size problem was eliminated in WordList.cpp 
by using string input, but this required a cast so that strtok( ) could modify 
the data in the string, a dangerous approach for general-purpose 
programming). 

For all these reasons it seems like a good idea to find an alternative for strtok( ). The 
following example will use an istreambuf_iterator (introduced earlier) to move the 
characters from one place (which happens to be an istream) to another (which happens to be 
a string), depending on whether the Standard C library function isalpha( ) is true: 





//: C04:WordList2.cpp
// Eliminating strtok() from WordList.cpp
#include "../require.h"
#include <string>
#include <cstring>
#include <cctype>
#include <set>
#include <iostream>
#include <fstream>
#include <iterator>
#include <algorithm>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  istreambuf_iterator<char> p(in), end;
  set<string> wordlist;
  while(p != end) {
    string word;
    insert_iterator<string>
      ii(word, word.begin());
    // Find the first alpha character
    // (test for the end before dereferencing):
    while(p != end && !isalpha(*p))
      p++;
    // Copy until the first non-alpha character:
    while(p != end && isalpha(*p))
      *ii++ = *p++;
    if(word.size() != 0)
      wordlist.insert(word);
  }
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
    ostream_iterator<string>(cout, "\n"));
} ///:~

This example was suggested by Nathan Myers, who invented the istreambuf_iterator and its 
relatives. This iterator extracts information character-by-character from a stream. Although 
the istreambuf_iterator template argument might suggest to you that you could extract, for 
example, ints instead of char, that's not the case. The argument must be of some character 
type - a regular char or a wide character. 

After the file is open, an istreambuf_iterator called p is attached to the istream so characters 
can be extracted from it. The set<string> called wordlist will be used to hold the resulting 
words. 

The while loop reads words until the end of the input stream is found. This is detected using 
the default constructor for istreambuf_iterator, which produces the past-the-end iterator 
object end. Thus, if you want to test to make sure you're not at the end of the stream, you 
simply say p != end. 
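
The same begin/past-the-end pair can be handed to anything that accepts an iterator range. As a small sketch, with slurp being a hypothetical name chosen for illustration, this reads everything left in a stream into a string:

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads every remaining character of a stream,
// whitespace included, into a string.
std::string slurp(std::istream& in) {
  typedef std::istreambuf_iterator<char> It;
  // It(in) reads from the stream's buffer; It() is the past-the-end value.
  return std::string(It(in), It());
}
```

Unlike formatted extraction with operator>>, no whitespace is skipped.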

The second type of iterator that's used here is the insert_iterator, which creates an iterator 
that knows how to insert objects into a container. Here, the "container" is the string called 
word which, for the purposes of insert_iterator, behaves like a container. The constructor for 
insert_iterator requires the container and an iterator indicating where it should start inserting 
the characters. You could also use a back_insert_iterator, which requires that the container 
have a push_back( ) (string does). 
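
A minimal sketch of the back_insert_iterator variant follows. The function name lettersOnly is an assumption made for this illustration; the cast to unsigned char is there because isalpha( ) is only defined for values in that range (or EOF).

```cpp
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical helper: keeps only the alphabetic characters of text.
std::string lettersOnly(const std::string& text) {
  std::string result;
  // back_inserter calls result.push_back() for every assignment.
  std::back_insert_iterator<std::string> out = std::back_inserter(result);
  for (std::string::const_iterator p = text.begin(); p != text.end(); ++p)
    if (std::isalpha(static_cast<unsigned char>(*p)))
      *out++ = *p;
  return result;
}
```

With back_inserter( ) there is no insertion position to supply, since characters are always appended at the end.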

After the while loop sets everything up, it begins by looking for the first alpha character, 
incrementing p until that character is found. Then it copies characters from one iterator to 
the other, stopping when a non-alpha character is found. Each word, assuming it is non- 
empty, is added to wordlist. 



StreamTokenizer: a more flexible solution 



This is still a special case compared to the generality of strtok( ). What we'd like now is an 
actual replacement for strtok( ) so we're never tempted to use it. WordList2.cpp can be 
modified to create a class called StreamTokenizer that delivers a new token as a string 
whenever you call next( ), according to the delimiters you give it upon construction (very 
similar to strtok( )): 

//: C04:StreamTokenizer.h
// C++ Replacement for Standard C strtok()
#ifndef STREAMTOKENIZER_H
#define STREAMTOKENIZER_H
#include <string>
#include <iostream>
#include <iterator>

class StreamTokenizer {
  typedef std::istreambuf_iterator<char> It;
  It p, end;
  std::string delimiters;
  bool isDelimiter(char c) {
    return
      delimiters.find(c) != std::string::npos;
  }
public:
  StreamTokenizer(std::istream& is,
    std::string delim = " \t\n;()\"<>:{}[]+-=&*#"
    ".,/\\~!0123456789") : p(is), end(It()),
    delimiters(delim) {}
  std::string next(); // Get next token
};
#endif // STREAMTOKENIZER_H ///:~

The default delimiters for the StreamTokenizer constructor extract words with only alpha 
characters, as before, but now you can choose different delimiters to parse different tokens. 
The implementation of next( ) looks similar to WordList2.cpp: 

//: C04:StreamTokenizer.cpp {O}
#include "StreamTokenizer.h"
using namespace std;

string StreamTokenizer::next() {
  string result;
  if(p != end) {
    insert_iterator<string>
      ii(result, result.begin());
    // Test for the end before dereferencing:
    while(p != end && isDelimiter(*p))
      p++;
    while(p != end && !isDelimiter(*p))
      *ii++ = *p++;
  }
  return result;
} ///:~

The first non-delimiter is found, then characters are copied until a delimiter is found, and the 
resulting string is returned. Here's a test: 

//: C04:TokenizeTest.cpp
//{L} StreamTokenizer
// Test StreamTokenizer
#include "StreamTokenizer.h"
#include "../require.h"
#include <iostream>
#include <fstream>
#include <set>
#include <algorithm>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  set<string> wordlist;
  string word;
  while((word = words.next()).size() != 0)
    wordlist.insert(word);
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
    ostream_iterator<string>(cout, "\n"));
} ///:~

Now the tool is more reusable than before, but it's still inflexible, because it can only work 
with an istream. This isn't as bad as it first seems, since a string can be turned into an 
istream via an istringstream. But in the next section we'll come up with the most general, 
reusable tokenizing tool, and this should give you a feeling of what "reusable" really means, 
and the effort necessary to create truly reusable code. 
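
To make the istringstream remark concrete, here is a self-contained sketch. It does not use the StreamTokenizer class; splitStream and splitString are hypothetical names, and the point is only that a function written against istream can be reused for a string by wrapping the string in an istringstream.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits everything left in a stream into tokens,
// treating each character in delims as a separator.
std::vector<std::string> splitStream(std::istream& in,
                                     const std::string& delims) {
  std::istreambuf_iterator<char> p(in), end;
  std::vector<std::string> tokens;
  std::string word;
  for (; p != end; ++p) {
    if (delims.find(*p) == std::string::npos) {
      word += *p;               // Ordinary character: extend the token
    } else if (!word.empty()) {
      tokens.push_back(word);   // Delimiter: finish the current token
      word.clear();
    }
  }
  if (!word.empty()) tokens.push_back(word);  // Token that ends at EOF
  return tokens;
}

// Hypothetical helper: wraps a string in an istringstream so the
// stream-based function above can be reused unchanged.
std::vector<std::string> splitString(const std::string& text,
                                     const std::string& delims) {
  std::istringstream in(text);
  return splitStream(in, delims);
}
```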

A completely reusable tokenizer 

Since the STL containers and algorithms all revolve around iterators, the most flexible 
solution will itself be an iterator. You could think of the TokenIterator as an iterator that 
wraps itself around any other iterator that can produce characters. Because it is designed as an 
input iterator (the most primitive type of iterator) it can be used with any STL algorithm that 
requires only an input iterator. Not only is it a useful tool in itself, the TokenIterator is also a 
good example of how you can design your own iterators. (This is another example coached by 
Nathan Myers.) 

The TokenIterator is doubly flexible: first, you can choose the type of iterator that will 
produce the char input. Second, instead of just saying what characters represent the 
delimiters, TokenIterator will use a predicate which is a function object whose operator( ) 
takes a char and decides if it should be in the token or not. Although the two examples given 
here have a static concept of what characters belong in a token, you could easily design your 
own function object to change its state as the characters are read, producing a more 
sophisticated parser. 

The following header file contains the two basic predicates Isalpha and Delimiters, along 
with the template for TokenIterator: 

//: C04:TokenIterator.h
#ifndef TOKENITERATOR_H
#define TOKENITERATOR_H
#include <string>
#include <iterator>
#include <algorithm>
#include <cctype>
#include <cstddef>

struct Isalpha {
  bool operator()(char c) {
    using namespace std; //[[For a compiler bug]]
    return isalpha(c) != 0;
  }
};

class Delimiters {
  std::string exclude;
public:
  Delimiters() {}
  Delimiters(const std::string& excl)
    : exclude(excl) {}
  bool operator()(char c) {
    return exclude.find(c) == std::string::npos;
  }
};

template <class InputIter, class Pred = Isalpha>
class TokenIterator: public std::iterator<
  std::input_iterator_tag, std::string, std::ptrdiff_t> {
  InputIter first;
  InputIter last;
  std::string word;
  Pred predicate;
public:
  TokenIterator(InputIter begin, InputIter end,
    Pred pred = Pred())
    : first(begin), last(end), predicate(pred) {
      ++*this;
  }
  TokenIterator() {} // End sentinel
  // Prefix increment:
  TokenIterator& operator++() {
    word.resize(0);
    first = std::find_if(first, last, predicate);
    while(first != last && predicate(*first))
      word += *first++;
    return *this;
  }
  // Postfix increment
  class Proxy {
    std::string word;
  public:
    Proxy(const std::string& w) : word(w) {}
    std::string operator*() { return word; }
  };
  Proxy operator++(int) {
    Proxy d(word);
    ++*this;
    return d;
  }
  // Produce the actual value:
  std::string operator*() const { return word; }
  // Point at the stored word, not at a temporary:
  const std::string* operator->() const {
    return &word;
  }
  // Compare iterators:
  bool operator==(const TokenIterator&) const {
    return word.size() == 0 && first == last;
  }
  bool operator!=(const TokenIterator& rv) const {
    return !(*this == rv);
  }
};
#endif // TOKENITERATOR_H ///:~

TokenIterator is inherited from the std::iterator template. It might appear that there's some 
kind of functionality that comes with std::iterator, but it is purely a way of tagging an 
iterator so that a container that uses it knows what it's capable of. Here, you can see 
input_iterator_tag as a template argument; this tells anyone who asks that a TokenIterator 
only has the capabilities of an input iterator, and cannot be used with algorithms requiring 
more sophisticated iterators. Apart from the tagging, std::iterator doesn't do anything else, 
which means you must design all the other functionality in yourself. 



TokenIterator may look a little strange at first, because the first constructor requires both a 
"begin" and "end" iterator as arguments, along with the predicate. Remember that this is a 
"wrapper" iterator that has no idea of how to tell whether it's at the end of its input source, so 
the ending iterator is necessary in the first constructor. The reason for the second (default) 
constructor is that the STL algorithms (and any algorithms you write) need a TokenIterator 
sentinel to be the past-the-end value. Since all the information necessary to see if the 
TokenIterator has reached the end of its input is collected in the first constructor, this second 
constructor creates a TokenIterator that is merely used as a placeholder in algorithms. 

The core of the behavior happens in operator++. This erases the current value of word using 
string::resize( ), then finds the first character that satisfies the predicate (thus discovering the 
beginning of the new token) using find_if( ) (from the STL algorithms, discussed in the 
following chapter). The resulting iterator is assigned to first, thus moving first forward to the 
beginning of the token. Then, as long as the end of the input is not reached and the predicate 
is satisfied, characters are copied into the word from the input. Finally, the TokenIterator 
object is returned, and must be dereferenced to access the new token. 
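
The skip-then-collect step can also be looked at in isolation. The following sketch restates it as a free function; nextToken and IsAlpha are hypothetical names introduced only for this illustration and are not part of the header.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true for characters that belong in a token.
struct IsAlpha {
  bool operator()(char c) const {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
  }
};

// Hypothetical free-function version of the increment step: advances
// first past one token and returns that token (empty at end of input).
template <class InputIter, class Pred>
std::string nextToken(InputIter& first, InputIter last, Pred inToken) {
  std::string word;
  // Skip to the first character that satisfies the predicate.
  first = std::find_if(first, last, inToken);
  // Collect characters for as long as the predicate keeps holding.
  while (first != last && inToken(*first))
    word += *first++;
  return word;
}
```

An empty result together with first == last corresponds to the state that the iterator's operator== treats as "at the end."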

The postfix increment requires a proxy object to hold the value before the increment, so it can 
be returned (see the operator overloading chapter for more details of this). Producing the 
actual value is a straightforward operator*. The only other functions that must be defined for 
an input iterator are the operator== and operator!= to indicate whether the TokenIterator 
has reached the end of its input. You can see that the argument for operator== is ignored - it 
only cares about whether it has reached its internal last iterator. Notice that operator!= is 
defined in terms of operator==. 

A good test of TokenIterator includes a number of different sources of input characters 
including an istreambuf_iterator, a char*, and a deque<char>::iterator. Finally, the original 
WordList.cpp problem is solved: 



//: C04:TokenIteratorTest.cpp
#include "TokenIterator.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <cstring>
#include <vector>
#include <deque>
#include <set>
using namespace std;

int main() {
  ifstream in("TokenIteratorTest.cpp");
  assure(in, "TokenIteratorTest.cpp");
  ostream_iterator<string> out(cout, "\n");
  typedef istreambuf_iterator<char> IsbIt;
  IsbIt begin(in), isbEnd;
  Delimiters
    delimiters(" \t\n~;()\"<>:{}[]+-=&*#.,/\\");
  TokenIterator<IsbIt, Delimiters>
    wordIter(begin, isbEnd, delimiters),
    end;
  vector<string> wordlist;
  copy(wordIter, end, back_inserter(wordlist));
  // Output results:
  copy(wordlist.begin(), wordlist.end(), out);
  // Use a char array as the source
  // (a string literal must be pointed to as const):
  const char* cp =
    "typedef std::istreambuf_iterator<char> It";
  TokenIterator<const char*, Delimiters>
    charIter(cp, cp + strlen(cp), delimiters),
    end2;
  vector<string> wordlist2;
  copy(charIter, end2, back_inserter(wordlist2));
  copy(wordlist2.begin(), wordlist2.end(), out);
  // Use a deque<char> as the source:
  ifstream in2("TokenIteratorTest.cpp");
  deque<char> dc;
  copy(IsbIt(in2), IsbIt(), back_inserter(dc));
  TokenIterator<deque<char>::iterator, Delimiters>
    dcIter(dc.begin(), dc.end(), delimiters),
    end3;
  vector<string> wordlist3;
  copy(dcIter, end3, back_inserter(wordlist3));
  copy(wordlist3.begin(), wordlist3.end(), out);
  // Reproduce the WordList.cpp example:
  ifstream in3("TokenIteratorTest.cpp");
  TokenIterator<IsbIt, Delimiters>
    wordIter2(IsbIt(in3), isbEnd, delimiters);
  set<string> wordlist4;
  while(wordIter2 != end)
    wordlist4.insert(*wordIter2++);
  copy(wordlist4.begin(), wordlist4.end(), out);
} ///:~

When using an istreambuf_iterator, you create one to attach to the istream object, and one 
with the default constructor as the past-the-end marker. Both of these are used to create the 
TokenIterator that will actually produce the tokens; the default constructor produces the faux 
TokenIterator past-the-end sentinel (this is just a placeholder, and as mentioned previously is 
actually ignored). The TokenIterator produces strings that are inserted into a container 
which must, naturally, be a container of string - here a vector<string> is used in all cases 
except the last (you could also concatenate the results onto a string). Other than that, a 
TokenIterator works like any other input iterator. 



Stack 



The stack, along with the queue and priority_queue, are classified as adapters, which means 
they are implemented using one of the basic sequence containers: vector, list or deque. This, 
in my opinion, is an unfortunate case of confusing what something does with the details of its 
underlying implementation - the fact that these are called "adapters" is of primary value only 
to the creator of the library. When you use them, you generally don't care that they're 
adapters, but instead that they solve your problem. Admittedly there are times when it's useful 
to know that you can choose an alternate implementation or build an adapter from an existing 
container object, but that's generally one level removed from the adapter's behavior. So, 
while you may see it emphasized elsewhere that a particular container is an adapter, I shall 
only point out that fact when it's useful. Note that each type of adapter has a default container 
that it's built upon, and this default is the most sensible implementation, so in most cases you 
won't need to concern yourself with the underlying implementation. 

The following example shows stack<string> implemented in the three possible ways: the 
default (which uses deque), with a vector and with a list: 

//: C04:Stack1.cpp
// Demonstrates the STL stack
#include "../require.h"
#include <iostream>
#include <fstream>
#include <stack>
#include <list>
#include <vector>
#include <string>
using namespace std;

// Default: deque<string>:
typedef stack<string> Stack1;
// Use a vector<string>:
typedef stack<string, vector<string> > Stack2;
// Use a list<string>:
typedef stack<string, list<string> > Stack3;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1); // File name is argument
  ifstream in(argv[1]);
  assure(in, argv[1]);
  Stack1 textlines; // Try the different versions
  // Read file and store lines in the stack:
  string line;
  while(getline(in, line))
    textlines.push(line + "\n");
  // Print lines from the stack and pop them:
  while(!textlines.empty()) {
    cout << textlines.top();
    textlines.pop();
  }
} ///:~

The top( ) and pop( ) operations will probably seem non-intuitive if you've used other stack 
classes. When you call pop( ) it returns void rather than the top element that you might have 
expected. If you want the top element, you get a reference to it with top( ). It turns out this is 
more efficient, since a traditional pop( ) would have to return a value rather than a reference, 
and thus invoke the copy-constructor. When you're using a stack (or a priority_queue, 
described later) you can efficiently refer to top( ) as many times as you want, then discard the 
top element explicitly using pop( ) (perhaps if some other term than the familiar "pop" had 
been used, this would have been a bit clearer). 
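
If you do want the traditional behavior, it is easy to layer on top. The sketch below assumes a hypothetical helper called popValue; it pays for exactly the copy that the Standard interface lets you avoid.

```cpp
#include <stack>
#include <string>

// Hypothetical convenience function: copies the top element, removes it,
// and returns the copy. The stack must not be empty.
template <class T, class Container>
T popValue(std::stack<T, Container>& s) {
  T value = s.top();  // top() returns a reference; this makes the copy
  s.pop();            // pop() returns void and discards the element
  return value;
}
```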

The stack template has a very simple interface, essentially the member functions you see 
above. It doesn't have sophisticated forms of initialization or access, but if you need that you 
can use the underlying container that the stack is implemented upon. For example, suppose 
you have a function that expects a stack interface but in the rest of your program you need the 
objects stored in a list. The following program stores each line of a file along with the leading 
number of spaces in that line (you might imagine it as a starting point for performing some 
kinds of source-code reformatting): 

//: C04:Stack2.cpp
// Converting a list to a stack
#include "../require.h"
#include <iostream>
#include <fstream>
#include <stack>
#include <list>
#include <string>
using namespace std;

// Expects a stack:
template<class Stk>
void stackOut(Stk& s, ostream& os = cout) {
  while(!s.empty()) {
    os << s.top() << "\n";
    s.pop();
  }
}

class Line {
  string line; // Without leading spaces
  string::size_type lspaces; // Number of leading spaces
public:
  Line(string s) : line(s) {
    lspaces = line.find_first_not_of(' ');
    if(lspaces == string::npos)
      lspaces = 0; // Empty or all-space line
    line = line.substr(lspaces);
  }
  friend ostream&
  operator<<(ostream& os, const Line& l) {
    for(string::size_type i = 0; i < l.lspaces; i++)
      os << ' ';
    return os << l.line;
  }
  // Other functions here...
};

int main(int argc, char* argv[]) {
  requireArgs(argc, 1); // File name is argument
  ifstream in(argv[1]);
  assure(in, argv[1]);
  list<Line> lines;
  // Read file and store lines in the list:
  string s;
  while(getline(in, s))
    lines.push_front(s);
  // Turn the list into a stack for printing:
  stack<Line, list<Line> > stk(lines);
  stackOut(stk);
} ///:~



The function that requires the stack interface just sends each top( ) object to an ostream and 
then removes it by calling pop( ). The Line class determines the number of leading spaces, 
then stores the contents of the line without the leading spaces. The ostream operator<< 
re-inserts the leading spaces so the line prints properly, but you can easily change the number of 
spaces by changing the value of lspaces (the member functions to do this are not shown here). 

In main( ), the input file is read into a list<Line>, then a stack is wrapped around this list so 
it can be sent to stackOut( ). 
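
One detail worth keeping in mind is that constructing a stack from an existing container copies that container, so popping the stack does not disturb the original. A small sketch, with sumViaStack as a hypothetical name:

```cpp
#include <list>
#include <stack>

// Hypothetical helper: sums the elements by popping a stack that was
// built from an existing list.
int sumViaStack(const std::list<int>& values) {
  // The stack copies the list; popping below leaves values untouched.
  std::stack<int, std::list<int> > stk(values);
  int total = 0;
  while (!stk.empty()) {
    total += stk.top();  // top() is the underlying list's back()
    stk.pop();
  }
  return total;
}
```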

You cannot iterate through a stack; this emphasizes that you only want to perform stack 
operations when you create a stack. You can get equivalent "stack" functionality using a 
vector and its back( ), push_back( ) and pop_back( ) methods, and then you have all the 
additional functionality of the vector. Stack1.cpp can be rewritten to show this: 

//: C04:Stack3.cpp
// Using a vector as a stack; modified Stack1.cpp
#include "../require.h"
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  vector<string> textlines;
  string line;
  while(getline(in, line))
    textlines.push_back(line + "\n");
  while(!textlines.empty()) {
    cout << textlines.back();
    textlines.pop_back();
  }
} ///:~

You'll see this produces the same output as Stack1.cpp, but you can now perform vector 
operations as well. Of course, list has the additional ability to push things at the front, but it's 
generally less efficient than using push_back( ) with vector. (In addition, deque is usually 
more efficient than list for pushing things at the front). 



queue 



The queue is a restricted form of a deque - you can only enter elements at one end, and pull 
them off the other end. Functionally, you could use a deque anywhere you need a queue, and 
you would then also have the additional functionality of the deque. The only reason you need 
to use a queue rather than a deque, then, is if you want to emphasize that you will only be 
performing queue-like behavior. 

The queue is an adapter class like stack, in that it is built on top of another sequence 
container. As you might guess, the ideal implementation for a queue is a deque, and that is 
the default template argument for the queue; you'll rarely need a different implementation. 
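
The restricted interface amounts to push( ) at the back plus front( ) and pop( ) at the front. A brief sketch, in which drain is a hypothetical helper name:

```cpp
#include <queue>
#include <vector>

// Hypothetical helper: empties a queue and returns its elements in the
// order they were served (first in, first out).
template <class T>
std::vector<T> drain(std::queue<T>& q) {
  std::vector<T> served;
  while (!q.empty()) {
    served.push_back(q.front());  // front() is the oldest element
    q.pop();                      // pop() removes it from the front
  }
  return served;
}
```

As with stack, pop( ) returns void, so the value is read with front( ) first.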

Queues are often used when modeling systems where some elements of the system are 
waiting to be served by other elements in the system. A classic example of this is the "bank- 
teller problem," where you have customers arriving at random intervals, getting into a line, 
and then being served by a set of tellers. Since the customers arrive randomly and each takes a 
random amount of time to be served, there's no way to deterministically know how long the 
line will be at any time. However, it's possible to simulate the situation and see what happens. 

A problem in performing this simulation is the fact that, in effect, each customer and teller 
should be run by a separate process. What we'd like is a multithreaded environment, so that 
each customer or teller would have their own thread. However, Standard C++ (before C++11) 
has no model for multithreading, so there is no standard solution to this problem. On the other 
hand, with a little adjustment to the code it's possible to simulate enough multithreading to 
provide a satisfactory solution to our problem. 

Multithreading means you have multiple threads of control running at once, in the same 
address space (this differs from multitasking, where you have different processes each running 
in their own address space). The trick is that you have fewer CPUs than you do threads (and 
very often only one CPU) so to give the illusion that each thread has its own CPU there is a 
time-slicing mechanism that says "OK, current thread — you've had enough time. I'm going to 
stop you and go give time to some other thread." This automatic stopping and starting of 
threads is called pre-emptive and it means you don't need to manage the threading process at 
all. 

An alternative approach is for each thread to voluntarily yield the CPU to the scheduler, 
which then goes and finds another thread that needs running. This is easier to synthesize, but 
it still requires a method of "swapping" out one thread and swapping in another (this usually 
involves saving the stack frame and using the standard C library functions setjmp( ) and 
longjmp( ); see my article in the (XX) issue of Computer Language magazine for an 
example). So instead, we'll build the time-slicing into the classes in the system. In this case, it 
will be the tellers that represent the "threads" (the customers will be passive), so each teller 
will have a run( ) method that will execute for a certain number of "time 
units," and then simply return. By using the ordinary return mechanism, we eliminate the need 
for any swapping. The resulting program, although small, provides a remarkably reasonable 
simulation: 
//: C04:BankTeller.cpp
// Using a queue and simulated multithreading
// To model a bank teller system
#include <iostream>
#include <queue>
#include <list>
#include <iterator>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

class Customer {
  int serviceTime;
public:
  Customer() : serviceTime(0) {}
  Customer(int tm) : serviceTime(tm) {}
  int getTime() { return serviceTime; }
  void setTime(int newtime) {
    serviceTime = newtime;
  }
  friend ostream&
  operator<<(ostream& os, const Customer& c) {
    return os << '[' << c.serviceTime << ']';
  }
};

class Teller {
  queue<Customer>& customers;
  Customer current;
  static const int slice = 5;
  int ttime; // Time left in slice
  bool busy; // Is teller serving a customer?
public:
  Teller(queue<Customer>& cq)
    : customers(cq), ttime(0), busy(false) {}
  Teller& operator=(const Teller& rv) {
    customers = rv.customers;
    current = rv.current;
    ttime = rv.ttime;
    busy = rv.busy;
    return *this;
  }
  bool isBusy() { return busy; }
  void run(bool recursion = false) {
    if(!recursion)
      ttime = slice;
    int servtime = current.getTime();
    if(servtime > ttime) {
      servtime -= ttime;
      current.setTime(servtime);
      busy = true; // Still working on current
      return;
    }
    if(servtime < ttime) {
      ttime -= servtime;
      if(!customers.empty()) {
        current = customers.front();
        customers.pop(); // Remove it
        busy = true;
        run(true); // Recurse
      }
      return;
    }
    if(servtime == ttime) {
      // Done with current, set to empty:
      current = Customer(0);
      busy = false;
      return; // No more time in this slice
    }
  }
};

// Inherit to access protected implementation:
class CustomerQ : public queue<Customer> {
public:
  friend ostream&
  operator<<(ostream& os, const CustomerQ& cd) {
    copy(cd.c.begin(), cd.c.end(),
      ostream_iterator<Customer>(os, ""));
    return os;
  }
};

int main() {
  CustomerQ customers;
  list<Teller> tellers;
  typedef list<Teller>::iterator TellIt;
  tellers.push_back(Teller(customers));
  srand(time(0)); // Seed random number generator
  while(true) {
    // Add a random number of customers to the
    // queue, with random service times:
    for(int i = 0; i < rand() % 5; i++)
      customers.push(Customer(rand() % 15 + 1));
    cout << '{' << tellers.size() << '}'
      << customers << endl;
    // Have the tellers service the queue:
    for(TellIt i = tellers.begin();
      i != tellers.end(); i++)
      (*i).run();
    cout << '{' << tellers.size() << '}'
      << customers << endl;
    // If line is too long, add another teller:
    if(customers.size() / tellers.size() > 2)
      tellers.push_back(Teller(customers));
    // If line is short enough, remove a teller:
    if(tellers.size() > 1 &&
      customers.size() / tellers.size() < 2)
      for(TellIt i = tellers.begin();
        i != tellers.end(); i++)
        if(!(*i).isBusy()) {
          tellers.erase(i);
          break; // Out of for loop
        }
  }
} ///:~

Each customer requires a certain amount of service time, which is the number of time units 
that a teller must spend on the customer in order to serve that customer's needs. Of course, the 
amount of service time will be different for each customer, and will be determined randomly. 
In addition, you won't know how many customers will be arriving in each interval, so this 
will also be determined randomly. 

The Customer objects are kept in a queue<Customer>, and each Teller object keeps a 
reference to that queue. When a Teller object is finished with its current Customer object, 
that Teller will get another Customer from the queue and begin working on the new 
Customer, reducing the Customer's service time during each time slice that the Teller is 
allotted. All this logic is in the run( ) member function, which is basically a three-way if 
statement based on whether the amount of time necessary to serve the customer is less than, 
greater than or equal to the amount of time left in the teller's current time slice. Notice that if 
the Teller has more time after finishing with a Customer, it gets a new customer and recurses 
into itself. 

Just as with a stack, when you use a queue, it's only a queue and doesn't have any of the 
other functionality of the basic sequence containers. In particular, you cannot get an iterator 
in order to step through the queue. However, the underlying sequence container (that the 
queue is built upon) is held as a protected member inside the queue, and the identifier for 
this member is specified in the C++ Standard as 'c', which means that you can inherit from 
queue in order to access the underlying implementation. The CustomerQ class does exactly 
that, for the sole purpose of defining an ostream operator<< that can iterate through the 
queue and print out its members. 
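
The same inheritance trick can be packaged generically. The following is only a sketch under the assumption that read-only iteration is all you need; IterableQueue is a hypothetical name, and this->c is required inside a template to reach the inherited protected member.

```cpp
#include <queue>

// Hypothetical adapter-of-an-adapter: inherits from queue to reach the
// protected member c, and exposes read-only iteration over it.
template <class T>
class IterableQueue : public std::queue<T> {
public:
  typedef typename std::queue<T>::container_type::const_iterator
    const_iterator;
  const_iterator begin() const { return this->c.begin(); }
  const_iterator end() const { return this->c.end(); }
};
```

Handing out only const iterators keeps callers from reordering elements behind the queue's back.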

The driver for the simulation is the infinite while loop in main( ). At the beginning of each 
pass through the loop, a random number of customers are added, with random service times. 
Both the number of tellers and the queue contents are displayed so you can see the state of the 
system. After running each teller, the display is repeated. At this point, the system adapts by 
comparing the number of customers and the number of tellers; if the line is too long another 
teller is added and if it is short enough a teller can be removed. It is in this adaptation section 
of the program that you can experiment with policies regarding the optimal addition and 
removal of tellers. If this is the only section that you're modifying, you may want to 
encapsulate policies inside of different objects. 
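
What such a policy object might look like is sketched below. RatioPolicy, shouldAdd and shouldRemove are hypothetical names, not part of the program; with a target of 2 the two tests reduce to the same comparisons the simulation makes inline.

```cpp
#include <cstddef>

// Hypothetical policy: staff so that each teller has roughly `target`
// customers waiting. Names and thresholds are illustrative only.
class RatioPolicy {
  std::size_t target;
public:
  explicit RatioPolicy(std::size_t customersPerTeller)
    : target(customersPerTeller) {}
  // Add a teller when the line per teller exceeds the target.
  // Assumes tellers is at least 1.
  bool shouldAdd(std::size_t customers, std::size_t tellers) const {
    return customers / tellers > target;
  }
  // Remove one when the line is short, but never the last teller.
  bool shouldRemove(std::size_t customers, std::size_t tellers) const {
    return tellers > 1 && customers / tellers < target;
  }
};
```

Swapping in a different class with the same two member functions would change the staffing behavior without touching the loop.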



Priority queues 



When you push( ) an object onto a priority_queue, that object is sorted into the queue 
according to a function or function object (you can allow the default less template to supply 
this, or provide one of your own). The priority_queue ensures that when you look at the 
top( ) element it will be the one with the highest priority. When you're done with it, you call 
pop( ) to remove it and bring the next one into place. Thus, the priority_queue has nearly the 
same interface as a stack, but it behaves differently. 

Like stack and queue, priority_queue is an adapter which is built on top of one of the basic 
sequences - the default is vector. 

It's trivial to make a priority_queue that works with ints: 

//: C04:PriorityQueue1.cpp
#include <iostream>
#include <queue>
#include <cstdlib>
#include <ctime>
using namespace std;

int main() {
  priority_queue<int> pqi;
  srand(time(0)); // Seed random number generator
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

This pushes into the priority_queue 100 random values from 0 to 24. When you run this 
program you'll see that duplicates are allowed, and the highest values appear first. To show 
how you can change the ordering by providing your own function or function object, the 
following program gives lower-valued numbers the highest priority: 

//: C04:PriorityQueue2.cpp
// Changing the priority
#include <iostream>
#include <queue>
#include <vector>
#include <functional>
#include <cstdlib>
#include <ctime>
using namespace std;

struct Reverse {
  bool operator()(int x, int y) const {
    return y < x;
  }
};

int main() {
  priority_queue<int, vector<int>, Reverse> pqi;
  // Could also say:
  // priority_queue<int, vector<int>,
  //   greater<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

Although you can easily use the Standard Library greater template to produce the predicate, I 
went to the trouble of creating Reverse so you could see how to do it in case you have a more 
complex scheme for ordering your objects. 

If you look at the description for p no rity_ queue, you see that the constructor can be handed a 
"Compare" object, as shown above. If you don't use your own "Compare" object, the default 
template behavior is the less template function. You might think (as 1 did) that it would make 
sense to leave the template instantiation as priority. queue<iDt>, thus using the default 
template arguments of vector<int> and less<int>. Then you could inherit a new class from 
less<int>, redefine operator! ) ^'^'^ hand an object of that type to the priority. queue 
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". I tried this, and got it to compile, but tlie resulting program produced the s; 
less<int> behavior. The answer lies in the less< > template: 

template <class T>
struct less : binary_function<T, T, bool> {
  // Other stuff...
  bool operator()(const T& x, const T& y) const {
    return x < y;
  }
};

The operator( ) is not virtual, so even though the constructor takes your subclass of
less<int> by reference, when operator( ) is called, it is the base-class version that is used.
(The priority_queue also keeps its own copy of the comparison object, and that copy has the
type named in the template argument, a plain less<int>, so the derived part does not survive
in any case.) While it is generally reasonable to expect ordinary classes to behave
polymorphically, you cannot make this assumption when using the STL.

Of course, a priority_queue of int is trivial. A more interesting problem is a to-do list, where 
each object contains a string and a primary and secondary priority value: 

//: C04:PriorityQueue3.cpp
// A more complex use of priority_queue
#include <iostream>
#include <queue>
#include <string>
using namespace std;

class ToDoItem {
  char primary;
  int secondary;
  string item;
public:
  ToDoItem(string td, char pri ='A', int sec =1)
    : primary(pri), secondary(sec), item(td) {}
  friend bool operator<(
    const ToDoItem& x, const ToDoItem& y) {
    if(x.primary > y.primary)
      return true;
    if(x.primary == y.primary)
      if(x.secondary > y.secondary)
        return true;
    return false;
  }
  friend ostream&
  operator<<(ostream& os, const ToDoItem& td) {
    return os << td.primary << td.secondary
      << ": " << td.item;
  }
};

int main() {
  priority_queue<ToDoItem> toDoList;
  toDoList.push(ToDoItem("Empty trash", 'C', 4));
  toDoList.push(ToDoItem("Feed dog", 'A', 2));
  toDoList.push(ToDoItem("Feed bird", 'B', 7));
  toDoList.push(ToDoItem("Mow lawn", 'C', 3));
  toDoList.push(ToDoItem("Water lawn", 'A', 1));
  toDoList.push(ToDoItem("Feed cat", 'B', 1));
  while(!toDoList.empty()) {
    cout << toDoList.top() << endl;
    toDoList.pop();
  }
} ///:~

ToDoItem's operator< must be a non-member function for it to work with less< >. Other
than that, everything happens automatically. The output is:

A1: Water lawn
A2: Feed dog
B1: Feed cat
B7: Feed bird
C3: Mow lawn
C4: Empty trash

Note that you cannot iterate through a priority_queue. However, it is possible to emulate the
behavior of a priority_queue using a vector, thus allowing you access to that vector. You
can do this by looking at the implementation of priority_queue, which uses make_heap( ),
push_heap( ) and pop_heap( ) (they are the soul of the priority_queue; in fact you could say
that the heap is the priority queue and priority_queue is just a wrapper around it). This turns
out to be reasonably straightforward, but you might think that a shortcut is possible. Since the
container used by priority_queue is protected (and, according to the Standard C++
specification, has the identifier c) you can inherit a new class which provides access to
the underlying implementation:



//: C04:PriorityQueue4.cpp
// Manipulating the underlying implementation
#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
#include <ctime>
using namespace std;

class PQI : public priority_queue<int> {
public:
  // c is the protected underlying container
  vector<int>& impl() { return c; }
};

int main() {
  PQI pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  copy(pqi.impl().begin(), pqi.impl().end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

However, if you run this program you'll discover that the vector doesn't contain the items in 
the descending order that you get when you call pop( ), the order that you want from the 
priority queue. It would seem that if you want to create a vector that is a priority queue, you 
have to do it by hand, like this: 

//: C04:PriorityQueue5.cpp
// Building your own priority queue
#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
#include <ctime>
using namespace std;

template<class T, class Compare>
class PQV : public vector<T> {
  Compare comp;
public:
  // this-> is needed because the base class
  // depends on the template parameter T
  PQV(Compare cmp = Compare()) : comp(cmp) {
    make_heap(this->begin(), this->end(), comp);
  }
  const T& top() { return this->front(); }
  void push(const T& x) {
    this->push_back(x);
    push_heap(this->begin(), this->end(), comp);
  }
  void pop() {
    pop_heap(this->begin(), this->end(), comp);
    this->pop_back();
  }
};

int main() {
  PQV<int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  copy(pqi.begin(), pqi.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

But this program behaves in the same way as the previous one! What you are seeing in the
underlying vector is called a heap. This heap represents the tree of the priority queue (stored
in the linear structure of the vector), but when you iterate through it you do not get a linear
priority-queue order. You might think that you can simply call sort_heap( ), but that only
works once, and then you don't have a heap anymore, but instead a sorted sequence. This means
that to go back to using it as a heap the user must remember to call make_heap( ) first. This
can be encapsulated into your custom priority queue:

//: C04:PriorityQueue6.cpp
#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
#include <ctime>
using namespace std;

template<class T, class Compare>
class PQV : public vector<T> {
  Compare comp;
  bool sorted;
  void assureHeap() {
    if(sorted) {
      // Turn it back into a heap:
      make_heap(this->begin(), this->end(), comp);
      sorted = false;
    }
  }
public:
  PQV(Compare cmp = Compare()) : comp(cmp) {
    make_heap(this->begin(), this->end(), comp);
    sorted = false;
  }
  const T& top() {
    assureHeap();
    return this->front();
  }
  void push(const T& x) {
    // push_heap() also requires a valid heap:
    assureHeap();
    // Put it at the end:
    this->push_back(x);
    // Re-adjust the heap:
    push_heap(this->begin(), this->end(), comp);
  }
  void pop() {
    assureHeap();
    // Move the top element to the last position:
    pop_heap(this->begin(), this->end(), comp);
    // Remove that element:
    this->pop_back();
  }
  void sort() {
    if(!sorted) {
      sort_heap(this->begin(), this->end(), comp);
      reverse(this->begin(), this->end());
      sorted = true;
    }
  }
};

int main() {
  PQV<int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++) {
    pqi.push(rand() % 25);
    copy(pqi.begin(), pqi.end(),
      ostream_iterator<int>(cout, " "));
    cout << "\n-----\n";
  }
  pqi.sort();
  copy(pqi.begin(), pqi.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----\n";
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

If sorted is true, then the vector is not organized as a heap, but instead as a sorted sequence.
assureHeap( ) guarantees that it's put back into heap form before performing any heap
operations on it.

The first for loop in main( ) now has the additional quality that it displays the heap as it's
being built.

The only drawback to this solution is that the user must remember to call sort( ) before
viewing it as a sorted sequence (although one could conceivably override all the methods that
produce iterators so that they guarantee sorting). Another solution is to build a priority queue
that is not a vector, but will build you a vector whenever you want one:





//: C04:PriorityQueue7.cpp
// A priority queue that will hand you a vector
#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
#include <ctime>
using namespace std;

template<class T, class Compare>
class PQV {
  // std:: is spelled out because this class
  // also has a member function named vector()
  std::vector<T> v;
  Compare comp;
public:
  // Don't need to call make_heap(); it's empty:
  PQV(Compare cmp = Compare()) : comp(cmp) {}
  void push(const T& x) {
    // Put it at the end:
    v.push_back(x);
    // Re-adjust the heap:
    push_heap(v.begin(), v.end(), comp);
  }
  void pop() {
    // Move the top element to the last position:
    pop_heap(v.begin(), v.end(), comp);
    // Remove that element:
    v.pop_back();
  }
  const T& top() { return v.front(); }
  bool empty() const { return v.empty(); }
  typedef std::vector<T> TVec;
  TVec vector() {
    TVec r(v.begin(), v.end());
    // It's already a heap
    sort_heap(r.begin(), r.end(), comp);
    // Put it into priority-queue order:
    reverse(r.begin(), r.end());
    return r;
  }
};

int main() {
  PQV<int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  const vector<int>& v = pqi.vector();
  copy(v.begin(), v.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----\n";
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

PQV follows the same form as the STL's priority_queue, but has the additional member
vector( ), which creates a new vector that's a copy of the one in PQV (which means that it's
already a heap), then sorts it (thus it leaves PQV's vector untouched), then reverses the order
so that traversing the new vector produces the same effect as popping the elements from the
priority queue.

You may observe that the approach of inheriting from priority_queue used in
PriorityQueue4.cpp could be used with the above technique to produce more compact code:

//: C04:PriorityQueue8.cpp
// A more compact version of PriorityQueue7.cpp
#include <iostream>
#include <queue>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
#include <ctime>
using namespace std;

template<class T>
class PQV : public priority_queue<T> {
public:
  typedef std::vector<T> TVec;
  TVec vector() {
    // c and comp are protected members of
    // priority_queue; this-> is needed because
    // the base class depends on T
    TVec r(this->c.begin(), this->c.end());
    // c is already a heap
    sort_heap(r.begin(), r.end(), this->comp);
    // Put it into priority-queue order:
    reverse(r.begin(), r.end());
    return r;
  }
};

int main() {
  PQV<int> pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  const vector<int>& v = pqi.vector();
  copy(v.begin(), v.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----\n";
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~

The brevity of this solution makes it the simplest and most desirable, plus it's guaranteed that
the user will not have a vector in the unsorted state. The only potential problem is that the
vector( ) member function returns the vector<T> by value, which might cause some
overhead issues with complex values of the parameter type T.



Holding bits 



C has traditionally had no native binary representation for numbers: it won't let you say
0b0101101, which is the obvious solution for a language close to the hardware.

Although there's still no native binary representation in C++, things have improved with the
addition of two classes: bitset and vector<bool>, both of which are designed to manipulate a
group of on-off values. The primary differences between these types are:

1. The bitset holds a fixed number of bits. You establish the quantity of bits in the bitset
template argument. The vector<bool> can, like a regular vector, expand dynamically to
hold any number of bool values.

2. The bitset is explicitly designed for performance when manipulating bits, and not as a
"regular" container. As such, it has no iterators and it's most storage-efficient when it
contains an integral number of long values. The vector<bool>, on the other hand, is a
specialization of a vector, and so has all the operations of a normal vector - the
specialization is just designed to be space-efficient for bool.

There is no trivial conversion between a bitset and a vector<bool>, which implies that the
two are for very different purposes.



bitset<n> 



The template for bitset accepts an integral template argument which is the number of bits to
represent. Thus, bitset<10> is a different type than bitset<20>, and you cannot perform
comparisons, assignments, etc. between the two.

A bitset provides virtually any bit operation that you could ask for, in a very efficient form.
However, each bitset is made up of an integral number of longs (typically 32 bits), so even
though it uses no more space than it needs, it always uses at least the size of a long. This
means you'll use space most efficiently if you increase the size of your bitsets in chunks of
the number of bits in a long. In addition, the only conversion from a bitset to a numerical
value is to an unsigned long, which means that 32 bits (if your long is the typical size) is the
most flexible form of a bitset.
The following example tests almost all the functionality of the bitset (the missing operations
are redundant or trivial). You'll see the description of each of the bitset outputs to the right of
the output so that the bits all line up and you can compare them to the source values. If you
still don't understand bitwise operations, running this program should help.

//: C04:BitSet.cpp
// Exercising the bitset class
#include <iostream>
#include <bitset>
#include <cstdlib>
#include <cstddef>
#include <ctime>
#include <climits>
#include <string>
using namespace std;
const int sz = 32;
typedef bitset<sz> BS;

template<int bits>
bitset<bits> randBitset() {
  bitset<bits> r(rand());
  for(int i = 0; i < bits/16 - 1; i++) {
    r <<= 16;
    // "OR" together with a new lower 16 bits:
    r |= bitset<bits>(rand());
  }
  return r;
}

int main() {
  srand(time(0));
  cout << "sizeof(bitset<16>) = "
    << sizeof(bitset<16>) << endl;
  cout << "sizeof(bitset<32>) = "
    << sizeof(bitset<32>) << endl;
  cout << "sizeof(bitset<48>) = "
    << sizeof(bitset<48>) << endl;
  cout << "sizeof(bitset<64>) = "
    << sizeof(bitset<64>) << endl;
  cout << "sizeof(bitset<65>) = "
    << sizeof(bitset<65>) << endl;
  BS a(randBitset<sz>()), b(randBitset<sz>());
  // Converting from a bitset:
  unsigned long ul = a.to_ulong();
  string s = b.to_string();
  // Converting a string to a bitset (the
  // positional constructors take a std::string):
  string cbits("111011010110111");
  cout << "string cbits = " << cbits << endl;
  cout << BS(cbits) << " [BS(cbits)]" << endl;
  cout << BS(cbits, 2)
    << " [BS(cbits, 2)]" << endl;
  cout << BS(cbits, 2, 11)
    << " [BS(cbits, 2, 11)]" << endl;
  cout << a << " [a]" << endl;
  cout << b << " [b]" << endl;
  // Bitwise AND:
  cout << (a & b) << " [a & b]" << endl;
  cout << (BS(a) &= b) << " [a &= b]" << endl;
  // Bitwise OR:
  cout << (a | b) << " [a | b]" << endl;
  cout << (BS(a) |= b) << " [a |= b]" << endl;
  // Exclusive OR:
  cout << (a ^ b) << " [a ^ b]" << endl;
  cout << (BS(a) ^= b) << " [a ^= b]" << endl;
  cout << a << " [a]" << endl; // For reference
  // Logical left shift (fill with zeros):
  cout << (BS(a) <<= sz/2)
    << " [a <<= (sz/2)]" << endl;
  cout << (a << sz/2) << endl;
  cout << a << " [a]" << endl; // For reference
  // Logical right shift (fill with zeros):
  cout << (BS(a) >>= sz/2)
    << " [a >>= (sz/2)]" << endl;
  cout << (a >> sz/2) << endl;
  cout << a << " [a]" << endl; // For reference
  cout << BS(a).set() << " [a.set()]" << endl;
  for(int i = 0; i < sz; i++)
    if(!a.test(i)) {
      cout << BS(a).set(i)
        << " [a.set(" << i << ")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).reset() << " [a.reset()]" << endl;
  for(int j = 0; j < sz; j++)
    if(a.test(j)) {
      cout << BS(a).reset(j)
        << " [a.reset(" << j << ")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).flip() << " [a.flip()]" << endl;
  cout << ~a << " [~a]" << endl;
  cout << a << " [a]" << endl; // For reference
  cout << BS(a).flip(1) << " [a.flip(1)]" << endl;
  BS c;
  cout << c << " [c]" << endl;
  cout << "c.count() = " << c.count() << endl;
  cout << "c.any() = "
    << (c.any() ? "true" : "false") << endl;
  cout << "c.none() = "
    << (c.none() ? "true" : "false") << endl;
  c[1].flip(); c[2].flip();
  cout << c << " [c]" << endl;
  cout << "c.count() = " << c.count() << endl;
  cout << "c.any() = "
    << (c.any() ? "true" : "false") << endl;
  cout << "c.none() = "
    << (c.none() ? "true" : "false") << endl;
  // Array indexing operations:
  c.reset();
  for(size_t k = 0; k < c.size(); k++)
    if(k % 2 == 0)
      c[k].flip();
  cout << c << " [c]" << endl;
  c.reset();
  // Assignment to bool:
  for(size_t ii = 0; ii < c.size(); ii++)
    c[ii] = (rand() % 100) < 25;
  cout << c << " [c]" << endl;
  // bool test:
  if(c[1] == true)
    cout << "c[1] == true" << endl;
  else
    cout << "c[1] == false" << endl;
} ///:~

To generate interesting random bitsets, the randBitset( ) function is created. The Standard C
rand( ) function only generates an int, so this function demonstrates operator<<= by shifting
each 16 random bits to the left until the bitset (which is templatized in this function for size)
is full. The generated number and each new 16 bits is combined using the operator|=.

The first thing demonstrated in main( ) is the unit size of a bitset. If it holds no more than
32 bits, sizeof produces 4 (4 bytes = 32 bits), which is the size of a single long on an
implementation where long is 32 bits (where long is 64 bits you should expect the unit to be
8 bytes instead). If it's between 32 and 64, it requires two longs, greater than 64 requires 3
longs, etc. Thus you make the best use of space if you use a bit quantity that fits in an integral
number of longs. However, notice there's no extra overhead for the object - it's as if you
were hand-coding to use a long.

Another clue that bitset is optimized for longs is that there is a to_ulong( ) member function
that produces the value of the bitset as an unsigned long. There are no other numerical
conversions from bitset, but there is a to_string( ) conversion that produces a string
containing ones and zeros, and this can be as long as the actual bitset. However, using
bitset<32> may make your life simpler because of to_ulong( ).

There's still no primitive format for binary values, but the next best thing is supported by
bitset: a string of ones and zeros with the least-significant bit (lsb) on the right. The three
constructors demonstrated show taking the entire string, the string starting at character 2, and
11 characters of the string starting at character 2 (the third argument is a count, not an end
position, and the positional forms need a string object rather than a char array). You can
write to an ostream from a bitset using operator<< and it comes out as ones and zeros. You
can also read from an istream using operator>> (not shown here).

You'll notice that bitset only has three non-member operators: and (&), or (|) and
exclusive-or (^). Each of these creates a new bitset as its return value. All of the member
operators opt for the more efficient &=, |=, etc. form where a temporary is not created.
However, these forms actually change their lvalue (which is a in most of the tests in the above
example). To prevent this, I created a temporary to be used as the lvalue by invoking the
copy-constructor on a; this is why you see the form BS(a). The result of each test is printed
out, and occasionally a is reprinted so you can easily look at it for reference.

The rest of the example should be self-explanatory when you run it; if not you can find the
details in your compiler's documentation or the other documentation mentioned earlier in this
chapter.


vector<bool> 



vector<bool> is a specialization of the vector template. A normal bool variable requires at
least one byte, but since a bool only has two states the ideal implementation of vector<bool>
is such that each bool value only requires one bit. This means the iterator must be
specially-defined, and cannot be a bool*.

The bit-manipulation functions for vector<bool> are much more limited than those of bitset.
The only member function that was added to those already in vector is flip( ), to invert all the
bits; there is no set( ) or reset( ) as in bitset. When you use operator[ ], you get back an
object of type vector<bool>::reference, which also has a flip( ) to invert that individual bit.

//: C04:VectorOfBool.cpp
// Demonstrate the vector<bool> specialization
#include <iostream>
#include <sstream>
#include <vector>
#include <bitset>
#include <iterator>
#include <algorithm>
#include <cstddef>
using namespace std;

int main() {
  vector<bool> vb(10, true);
  vector<bool>::iterator it;
  for(it = vb.begin(); it != vb.end(); it++)
    cout << *it;
  cout << endl;
  vb.push_back(false);
  ostream_iterator<bool> out(cout, "");
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  bool ab[] = { true, false, false, true,
    true, true, false, false, true };
  vb.assign(ab, ab + sizeof(ab)/sizeof(bool));
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  vb.flip(); // Flip all bits
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  for(size_t i = 0; i < vb.size(); i++)
    vb[i] = 0; // (Equivalent to "false")
  vb[4] = true;
  vb[5] = 1;
  vb[7].flip(); // Invert one bit
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  // Convert to a bitset:
  ostringstream os;
  copy(vb.begin(), vb.end(),
    ostream_iterator<bool>(os, ""));
  bitset<10> bs(os.str());
  cout << "Bitset:\n" << bs << endl;
} ///:~



The last part of this example takes a vector<bool> and converts it to a bitset by first turning it 
into a string of ones and zeros. Of course, you must know the size of the bitset at compile- 
time. You can see that this conversion is not the kind of operation you'll want to do on a 
regular basis. 
Associative containers



The set, map, multiset and multimap are called associative containers because they
associate keys with values. Well, at least maps and multimaps associate keys to values, but
you can look at a set as a map that has no values, only keys (and they can in fact be
implemented this way), and the same for the relationship between multiset and multimap.
So, because of the structural similarity sets and multisets are lumped in with associative
containers.

The most important basic operations with associative containers are putting things in, and in
the case of a set, seeing if something is in the set. In the case of a map, you want to first see if
a key is in the map, and if it exists you want the associated value for that key to be returned.
Of course, there are many variations on this theme but that's the fundamental concept. The
following example shows these basics:

//: C04:AssociativeBasics.cpp
// Basic operations with sets and maps
#include "Noisy.h"
#include <iostream>
#include <iterator>
#include <algorithm>
#include <cstddef>
#include <set>
#include <map>
using namespace std;

int main() {
  Noisy na[] = { Noisy(), Noisy(), Noisy(),
    Noisy(), Noisy(), Noisy(), Noisy() };
  // Add elements via constructor:
  set<Noisy> ns(na, na + sizeof na/sizeof(Noisy));
  // Ordinary insertion:
  Noisy n;
  ns.insert(n);
  cout << endl;
  // Check for set membership:
  cout << "ns.count(n)= " << ns.count(n) << endl;
  if(ns.find(n) != ns.end())
    cout << "n(" << n << ") found in ns" << endl;
  // Print elements:
  copy(ns.begin(), ns.end(),
    ostream_iterator<Noisy>(cout, " "));
  cout << endl;
  cout << "\n-----\n";
  map<int, Noisy> nm;
  for(int i = 0; i < 10; i++)
    nm[i]; // Automatically makes pairs
  cout << "\n-----\n";
  for(size_t j = 0; j < nm.size(); j++)
    cout << "nm[" << j << "] = " << nm[j] << endl;
  cout << "\n-----\n";
  nm[10] = n;
  cout << "\n-----\n";
  nm.insert(make_pair(47, n));
  cout << "\n-----\n";
  cout << "\n nm.count(10)= "
    << nm.count(10) << endl;
  cout << "nm.count(11)= "
    << nm.count(11) << endl;
  map<int, Noisy>::iterator it = nm.find(6);
  if(it != nm.end())
    cout << "value:" << (*it).second
      << " found in nm at location 6" << endl;
  for(it = nm.begin(); it != nm.end(); it++)
    cout << (*it).first << ":"
      << (*it).second << ", ";
  cout << "\n-----\n";
} ///:~

The set<Noisy> object ns is created using two iterators into an array of Noisy objects, but
there is also a default constructor and a copy-constructor, and you can pass in an object that
provides an alternate scheme for doing comparisons. Both sets and maps have an insert( )
member function to put things in, and there are a couple of different ways to check to see if an
object is already in an associative container: count( ), when given a key, will tell you how
many times that key occurs (this can only be zero or one in a set or map, but it can be more
than one with a multiset or multimap). The find( ) member function will produce an iterator
indicating the first occurrence (with set and map, the only occurrence) of the key that you
give it, or the past-the-end iterator if it can't find the key. The count( ) and find( ) member
functions exist for all the associative containers, which makes sense. The associative
containers also have member functions lower_bound( ), upper_bound( ) and
equal_range( ), which actually only make sense for multiset and multimap, as you shall see
(but don't try to figure out how they would be useful for set and map, since they are designed
for dealing with a range of duplicate keys, which those containers don't allow).

Designing an operator[ ] always produces a little bit of a dilemma because it's intended to be
treated as an array-indexing operation, so people don't tend to think about performing a test
before they use it. But what happens if you decide to index out of the bounds of the array?
One option, of course, is to throw an exception, but with a map "indexing out of the array"
could mean that you want an entry there, and that's the way the STL map treats it. The first
for loop after the creation of the map<int, Noisy> nm just "looks up" objects using the
operator[ ], but this is actually creating new Noisy objects! The map creates a new key-value
pair (using the default constructor for the value) if you look up a value with operator[ ] and it
isn't there. This means that if you really just want to look something up and not create a new
entry, you must use count( ) (to see if it's there) or find( ) (to get an iterator to it).

The for loop that prints out the values of the container using operator[ ] has a number of
problems. First, it requires integral keys (which we happen to have in this case). Next and
worse, if all the keys are not sequential, you'll end up counting from 0 to the size of the
container, and if there are some spots which don't have key-value pairs you'll automatically
create them, and miss some of the higher values of the keys. Finally, if you look at the output
from the for loop you'll see that things are very busy, and it's quite puzzling at first why there
are so many constructions and destructions for what appears to be a simple lookup. The
answer only becomes clear when you look at the code in the map template for operator[ ],
which will be something like this:

mapped_type& operator[] (const key_type& k) {
  value_type tmp(k, T());
  return (*((insert(tmp)).first)).second;
}

Following the trail, you'll find that map::value_type is:

typedef pair<const Key, T> value_type;

Now you need to know what a pair is, which can be found in <utility>:

template <class T1, class T2>
struct pair {
  typedef T1 first_type;
  typedef T2 second_type;
  T1 first;
  T2 second;
  pair();
  pair(const T1& x, const T2& y)
    : first(x), second(y) {}
  // Templatized copy-constructor:
  template<class U, class V>
    pair(const pair<U, V>& p);
};

It turns out this is a very important (albeit simple) struct which is used quite a bit in the STL.
All it really does is package together two objects, but it's very useful, especially when you
want to return two objects from a function (since a return statement only takes one object).
There's even a shorthand for creating a pair called make_pair( ), which is used in
AssociativeBasics.cpp.

So to retrace the steps, map::value_type is a pair of the key and the value of the map -
actually, it's a single entry for the map. But notice that pair packages its objects by value,
which means that copy-constructions are necessary to get the objects into the pair. Thus, the
creation of tmp in map::operator[ ] will involve at least a copy-constructor call and
destructor call for each object in the pair. Here, we're getting off easy because the key is an
int. But if you want to really see what kind of activity can result from map::operator[ ], try
running this:



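//: C04:NoisyMap.cpp
// Mapping Noisy to Noisy
#include "Noisy.h"
#include <iostream>
#include <map>
using namespace std;

int main() {
  map<Noisy, Noisy> mnn;
  Noisy n1, n2;
  cout << "\n-----\n";
  // Insertion:
  mnn[n1] = n2;
  cout << "\n-----\n";
  // Lookup:
  cout << mnn[n1] << endl;
  cout << "\n-----\n";
} ///:~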

You'll see that both the insertion and lookup generate a lot of extra objects, and that's because
of the creation of the tmp object. If you look back up at map::operator[ ] you'll see that the
second line calls insert( ) passing it tmp - that is, operator[ ] does an insertion every time.
The return value of insert( ) is a different kind of pair, where first is an iterator pointing to
the key-value pair that was just inserted (or to the existing pair with that key), and second is a
bool indicating whether the insertion took place. You can see that operator[ ] grabs first (the
iterator), dereferences it to produce the pair, and then returns the second which is the value at
that location.

So on the upside, map has this fancy "make a new entry if one isn't there" behavior, but the 
downside is that you always get a lot of extra object creations and destructions when you use 
map::operator[ ]. Fortunately, AssociativeBasics.cpp also demonstrates how to reduce the 
overhead of insertions and deletions, by not using operator[ ] if you don't have to. The 
insert( ) member function is slightly more efficient than operator[ ]. With a set you only hold 
one object, but with a map you hold key-value pairs, so insert( ) requires a pair as its 
argument. Here's where make_pair( ) comes in handy, as you can see. 

For looking objects up in a map, you can use count( ) to see whether a key is in the map, or
you can use find( ) to produce an iterator pointing directly at the key-value pair. Again, since
the map contains pairs that's what the iterator produces when you dereference it, so you have
to select first and second. When you run AssociativeBasics.cpp you'll notice that the iterator
approach involves no extra object creations or destructions at all. It's not as easy to write or
read, though.

If you use a map with large, complex objects and discover there's too much overhead when
doing lookups and insertions (don't assume this from the beginning - take the easy approach
first and use a profiler to discover bottlenecks), then you can use the counted-handle approach
shown in Chapter XX so that you are only passing around small, lightweight objects.

Of course, you can also iterate through a set or map and operate on each of its objects. This
will be demonstrated in later examples.



Generators and fillers 
for associative containers 



You've seen how useful the fill( ), fill_n( ), generate( ) and generate_n( ) function templates
in <algorithm> have been for filling the sequential containers (vector, list and deque) with
data. However, these are implemented by using operator= to assign values into the sequential
containers, and the way that you add objects to associative containers is with their respective
insert( ) member functions. Thus the default "assignment" behavior causes a problem when
trying to use the "fill" and "generate" functions with associative containers.

One solution is to duplicate the "fill" and "generate" functions, creating new ones that can be
used with associative containers. It turns out that only the fill_n( ) and generate_n( )
functions can be duplicated (fill( ) and generate( ) copy in between two iterators, which
doesn't make sense with associative containers), but the job is fairly easy, since you have the
<algorithm> header file to work from (and since it contains templates, all the source code is
there):

//: C04:assocGen.h
// The fill_n() and generate_n() equivalents
// for associative containers.
#ifndef ASSOCGEN_H
#define ASSOCGEN_H

template<class Assoc, class Count, class T>
void
assocFill_n(Assoc& a, Count n, const T& val) {
  while(n-- > 0)
    a.insert(val);
}

template<class Assoc, class Count, class Gen>
void assocGen_n(Assoc& a, Count n, Gen g) {
  while(n-- > 0)
    a.insert(g());
}
#endif // ASSOCGEN_H ///:~

You can see that instead of using iterators, the container class itself is passed (by reference, of
course, since you wouldn't want to make a local copy, fill it, and then have it discarded at the
end of the scope).

This code demonstrates two valuable lessons. The first lesson is that if the algorithms don't do 
what you want, copy the nearest thing and modify it. You have the example at hand in the 
STL header, so most of the work has already been done. 

The second lesson is more pointed: if you look long enough, there's probably a way to do it in 
the STL without inventing anything new. The present problem can instead be solved by using 
an insert_iterator (produced by a call to inserter( )), which calls insert( ) to place items in 
the container instead of operator=. This is not simply a variation of front_insert_iterator 
(produced by a call to front_inserter( )) or back_insert_iterator (produced by a call to 
back_inserter( )), since those iterators use push_front( ) and push_back( ), respectively. 
Each of the insert iterators is different by virtue of the member function it uses for insertion, 
and insert( ) is the one we need. Here's a demonstration that shows filling and generating 
both a map and a set (of course, it can also be used with multimap and multiset). First, some 
templatized, simple generators are created (this may seem like overkill, but you never know 
when you'll need them; for that reason they're placed in a header file): 



//: C04:SimpleGenerators.h
// Generic generators, including
// one that creates pairs
#include <iostream>
#include <utility>

// A generator that increments its value:
template<typename T>
class IncrGen {
  T i;
public:
  IncrGen(T ii) : i(ii) {}
  T operator()() { return i++; }
};

// A generator that produces an STL pair<>:
template<typename T1, typename T2>
class PairGen {
  T1 i;
  T2 j;
public:
  PairGen(T1 ii, T2 jj) : i(ii), j(jj) {}
  std::pair<T1,T2> operator()() {
    return std::pair<T1,T2>(i++, j++);
  }
};

// A generic global operator<<
// for printing any STL pair<>:
template<typename Pair> std::ostream&
operator<<(std::ostream& os, const Pair& p) {
  return os << p.first << "\t"
    << p.second << std::endl;
} ///:~

Both generators expect that T can be incremented, and they simply use operator++ to 
generate new values from whatever you used for initialization. PairGen creates an STL pair 
object as its return value, and that's what can be placed into a map or multimap using 
insert( ). 

The last function is a generalization of operator<< for ostreams, so that any pair can be 
printed, assuming each element of the pair supports a stream operator<<. As you can see 
below, this allows the use of copy( ) to output the map: 

//: C04:AssocInserter.cpp
// Using an insert_iterator so fill_n() and
// generate_n() can be used with associative
// containers
#include "SimpleGenerators.h"
#include <iterator>
#include <iostream>
#include <algorithm>
#include <set>
#include <map>
using namespace std;

int main() {
  set<int> s;
  fill_n(inserter(s, s.begin()), 10, 47);
  generate_n(inserter(s, s.begin()), 10,
    IncrGen<int>(12));
  copy(s.begin(), s.end(),
    ostream_iterator<int>(cout, "\n"));

  map<int, int> m;
  fill_n(inserter(m, m.begin()), 10,
    make_pair(90,120));
  generate_n(inserter(m, m.begin()), 10,
    PairGen<int, int>(3, 9));
  copy(m.begin(), m.end(),
    ostream_iterator<pair<int,int> >(cout, "\n"));
} ///:~

The second argument to inserter( ) is an iterator, which actually isn't used in the case of 
associative containers since they maintain their order internally, rather than allowing you to 
tell them where the element should be inserted. However, an insert_iterator can be used with 
many different types of containers so you must provide the iterator. 
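The point about inserter( ) can be checked with a small sketch. The names Counter, makeIntSet and fillSameValue are hypothetical helpers written for this illustration (Counter plays the role of IncrGen); only inserter( ), fill_n( ) and generate_n( ) come from the Standard Library.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical generator, similar in spirit to IncrGen in the text.
struct Counter {
  int next;
  explicit Counter(int start) : next(start) {}
  int operator()() { return next++; }
};

// generate_n() through an inserter: each generated value is passed to
// set::insert(). The iterator s.begin() must be supplied, but the set
// itself decides where every element ends up.
std::set<int> makeIntSet(int first, int n) {
  std::set<int> s;
  std::generate_n(std::inserter(s, s.begin()), n, Counter(first));
  return s;
}

// fill_n() through an inserter: a set rejects duplicates, so inserting
// the same value n times leaves a single element.
std::set<int> fillSameValue(int value, int n) {
  std::set<int> s;
  std::fill_n(std::inserter(s, s.begin()), n, value);
  return s;
}
```

Calling makeIntSet(12, 10) yields the ten values 12 through 21, while fillSameValue(47, 10) yields a set holding only 47, which is why the fill_n( ) call in the listing above adds just one element to the set.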

Note how the ostream_iterator is created to output a pair; this wouldn't have worked if the 
operator<< hadn't been created, and since it's a template it is automatically instantiated for 
pair<int, int>. 



The magic of maps 



A map is an associative array, which means you associate one object with another in an 
array-like fashion, but instead of selecting an array element with a number as you do with an 
ordinary array, you look it up with an object! The example which follows counts the words in 
a text file, so the index is the string object representing the word, and the value being looked 
up is the object that keeps count of the strings. 

In a single-item container like a vector or list, there's only one thing being held. But in a 
map, you've got two things: the key (what you look up by, as in mapname[key]) and the 
value that results from the lookup with the key. If you simply want to move through the entire 
map and list each key-value pair, you use an iterator, which when dereferenced produces a 
pair object containing both the key and the value. You access the members of a pair by 
selecting first or second. 

This same philosophy of packaging two items together is also used to insert elements into the 
map, but the pair is created as part of the instantiated map and is called value_type, 
containing the key and the value. So one option for inserting a new element is to create a 
value_type object, loading it with the appropriate objects and then calling the insert( ) 
member function for the map. Instead, the following example makes use of the 
aforementioned special feature of map: if you're trying to find an object by passing in a key 
to operator[ ] and that object doesn't exist, operator[ ] will automatically insert a new key- 
value pair for you, using the default constructor for the value object. With that in mind, 
consider an implementation of a word counting program: 



//: C04:WordCount.cpp
//{L} StreamTokenizer
// Count occurrences of words using a map
#include "StreamTokenizer.h"
#include "../require.h"
#include <string>
#include <map>
#include <iostream>
#include <fstream>
using namespace std;

class Count {
  int i;
public:
  Count() : i(0) {}
  void operator++(int) { i++; } // Post-increment
  int& val() { return i; }
};

typedef map<string, Count> WordMap;
typedef WordMap::iterator WMIter;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  WordMap wordmap;
  string word;
  while((word = words.next()).size() != 0)
    wordmap[word]++;
  for(WMIter w = wordmap.begin();
      w != wordmap.end(); w++)
    cout << (*w).first << ": "
      << (*w).second.val() << endl;
} ///:~

The need for the Count class is to contain an int that's automatically initialized to zero. This 
is necessary because of the crucial line: 

  wordmap[word]++;

This finds the word that has been produced by StreamTokenizer and increments the Count 
object associated with that word, which is fine as long as there is a key-value pair for that 
string. If there isn't, the map automatically inserts a key for the word you're looking up, and 
a Count object, which is initialized to zero by the default constructor. Thus, when it's 
incremented the Count becomes 1. 

Printing the entire list requires traversing it with an iterator (there's no copy( ) shortcut for a 
map unless you want to write an operator<< for the pair in the map). As previously 
mentioned, dereferencing this iterator produces a pair object, with the first member the key 
and the second member the value. In this case second is a Count object, so its val( ) member 
must be called to produce the actual word count. 

If you want to find the count for a particular word, you can use the array index operator: 

  wordmap["the"].val()

You can see that one of the great advantages of the map is the clarity of the syntax; an 
associative array makes intuitive sense to the reader (note, however, that if "the" isn't already 
in the wordmap a new entry will be created!). 
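The difference between operator[ ], find( ) and insert( ) is easy to see in isolation. The following is only a sketch: Counts, countWord, lookup and insertExplicit are hypothetical names, and a plain int stands in for the Count class because a map value-initializes a missing int to zero.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, int> Counts;

// operator[] inserts a zero-initialized int the first time a word is seen,
// so the increment always has something to work on.
void countWord(Counts& counts, const std::string& word) {
  counts[word]++;
}

// A read-only lookup with find() never adds an entry to the map.
int lookup(const Counts& counts, const std::string& word) {
  Counts::const_iterator it = counts.find(word);
  return it == counts.end() ? 0 : it->second;
}

// The explicit alternative: build a value_type and call insert().
// Returns false (and changes nothing) if the key is already present.
bool insertExplicit(Counts& counts, const std::string& word, int n) {
  return counts.insert(Counts::value_type(word, n)).second;
}
```

Using lookup( ) for queries keeps the map from growing every time someone asks about a word that was never counted.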



A command-line argument tool 

A common need in programming is managing program values that you can specify on the 
command line. Usually you'd like to have a set of defaults that can be 
changed via the command line. The following tool expects the command line arguments to be 
in the form flag1=value1 with no spaces around the '=' (so it will be treated as a single 
argument). The ProgVals class simply inherits from map<string, string>: 

//: C04:ProgVals.h
// Program values can be changed by command line
#ifndef PROGVALS_H
#define PROGVALS_H
#include <map>
#include <iostream>
#include <string>

class ProgVals
  : public std::map<std::string, std::string> {
public:
  ProgVals(std::string defaults[][2], int sz);
  void parse(int argc, char* argv[],
    std::string usage, int offset = 1);
  void print(std::ostream& out = std::cout);
};
#endif // PROGVALS_H ///:~

The constructor expects an array of string pairs (as you'll see, this allows you to initialize it 
with an array of char*) and the size of that array. The parse( ) member function is handed the 
command-line arguments along with a "usage" string to print if the command line is given 
incorrectly, and the "offset" which tells it which command-line argument to start with (so you 
can have non-flag arguments at the beginning of the command line). Finally, print( ) displays 
the values. Here is the implementation: 

//: C04:ProgVals.cpp {O}
#include "ProgVals.h"
using namespace std;

ProgVals::ProgVals(
  std::string defaults[][2], int sz) {
  for(int i = 0; i < sz; i++)
    insert(make_pair(
      defaults[i][0], defaults[i][1]));
}

void ProgVals::parse(int argc, char* argv[],
  string usage, int offset) {
  // Parse and apply additional
  // command-line arguments:
  for(int i = offset; i < argc; i++) {
    string flag(argv[i]);
    // size_type, not int, so the comparison with npos is reliable:
    string::size_type equal = flag.find('=');
    if(equal == string::npos) {
      cerr << argv[i] << endl << usage << endl;
      continue; // Next argument
    }
    string name = flag.substr(0, equal);
    string value = flag.substr(equal + 1);
    if(find(name) == end()) { // Unknown flag name
      cerr << name << endl << usage << endl;
      continue; // Next argument
    }
    operator[](name) = value;
  }
}

void ProgVals::print(ostream& out) {
  out << "Program values:" << endl;
  for(iterator it = begin(); it != end(); it++)
    out << (*it).first << " = "
        << (*it).second << endl;
} ///:~

The constructor uses the STL make_pair( ) helper function to convert each pair of char* into 
a pair object that can be inserted into the map. In parse( ), each command-line argument is 
checked for the existence of the telltale '=' sign (reporting an error if it isn't there), and then 
is broken into two strings, the name which appears before the '=', and the value which 
appears after. The operator[ ] is then used to change the existing value to the new one. 
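The splitting step on its own can be sketched as a free function. The name splitFlag and its signature are assumptions made for this illustration and are not part of ProgVals. Note that the position returned by find( ) is stored in a string::size_type; storing it in an int and comparing against string::npos is not portable.

```cpp
#include <string>

// Splits "name=value" at the first '='.
// Returns false, leaving name and value untouched, when there is no '='.
bool splitFlag(const std::string& arg,
               std::string& name, std::string& value) {
  std::string::size_type equal = arg.find('=');
  if (equal == std::string::npos)
    return false;
  name = arg.substr(0, equal);    // Everything before the '='
  value = arg.substr(equal + 1);  // Everything after it
  return true;
}
```

Because only the first '=' is significant, an argument such as a=b=c produces the name a and the value b=c.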

Here's an example to test the tool: 

//: C04:ProgValTest.cpp
//{L} ProgVals
#include "ProgVals.h"
using namespace std;

string defaults[][2] = {
  { "color", "red" },
  { "size", "medium" },
  { "shape", "rectangular" },
  { "action", "hopping" },
};

const char* usage = "usage:\n"
"ProgValTest [flag1=val1 flag2=val2 ...]\n"
"(Note no space around '=')\n"
"Where the flags can be any of: \n"
"color, size, shape, action \n";

// So it can be used globally:
ProgVals pvals(defaults,
  sizeof defaults / sizeof *defaults);

class Animal {
  string color, size, shape, action;
public:
  Animal(string col, string sz,
    string shp, string act)
    : color(col), size(sz), shape(shp), action(act) {}
  // Default constructor uses program default
  // values, possibly changed on command line:
  Animal() : color(pvals["color"]),
    size(pvals["size"]), shape(pvals["shape"]),
    action(pvals["action"]) {}
  void print() {
    cout << "color = " << color << endl
      << "size = " << size << endl
      << "shape = " << shape << endl
      << "action = " << action << endl;
  }
  // And of course pvals can be used anywhere
  // else you'd like.
};

int main(int argc, char* argv[]) {
  // Initialize and parse command line values
  // before any code that uses pvals is called:
  pvals.parse(argc, argv, usage);
  pvals.print();
  Animal a;
  cout << "Animal a values:" << endl;
  a.print();
} ///:~



This program can create Animal objects with different characteristics, and those 
characteristics can be established with the command line. The default characteristics are given 
in the two-dimensional array of char* called defaults and, after the usage string you can see 
a global instance of ProgVals called pvals is created; this is important because it allows the 
rest of the code in the program to access the values. 

Note that Animal's default constructor uses the values in pvals inside its constructor 
initializer list. When you run the program you can try creating different animal characteristics. 

Many command-line programs also use a style of beginning a flag with a hyphen, and 
sometimes they use single-character flags. 

The STL map is used in numerous places throughout the rest of this book. 



Multimaps and duplicate keys 

A multimap is a map that can contain duplicate keys. At first this may seem like a strange 
idea, but it can occur surprisingly often. A phone book, for example, can have many entries 
with the same name. 

Suppose you are monitoring wildlife, and you want to keep track of where and when each 
type of animal is spotted. Thus, you may see many animals of the same kind, all in different 
locations and at different times. So if the type of animal is the key, you'll need a multimap. 
Here's what it looks like: 



//: C04:WildLifeMonitor.cpp
#include <vector>
#include <map>
#include <string>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <sstream>
#include <cstdlib>
#include <ctime>
using namespace std;

class DataPoint {
  int x, y; // Location coordinates
  time_t time; // Time of sighting
public:
  DataPoint() : x(0), y(0), time(0) {}
  DataPoint(int xx, int yy, time_t tm) :
    x(xx), y(yy), time(tm) {}
  // Synthesized operator=, copy-constructor OK
  // Accessors are const so they work on a const Sighting:
  int getX() const { return x; }
  int getY() const { return y; }
  const time_t* getTime() const { return &time; }
};

string animal[] = {
  "chipmunk", "beaver",
  "squirrel", "ptarmigan",
  "hawk", "vole", "deer",
};
const int asz = sizeof animal/sizeof *animal;
vector<string> animals(animal, animal + asz);

// All the information is contained in a
// "Sighting," which can be sent to an ostream:
typedef pair<string, DataPoint> Sighting;

ostream&
operator<<(ostream& os, const Sighting& s) {
  return os << s.first << " sighted at x= "
    << s.second.getX() << ", y= " << s.second.getY()
    << ", time = " << ctime(s.second.getTime());
}

// A generator for Sightings:
class SightingGen {
  vector<string>& animals;
  static const int d = 100;
public:
  SightingGen(vector<string>& an) :
    animals(an) { srand(time(0)); }
  Sighting operator()() {
    Sighting result;
    int select = rand() % animals.size();
    result.first = animals[select];
    result.second = DataPoint(
      rand() % d, rand() % d, time(0));
    return result;
  }
};

typedef multimap<string, DataPoint> DataMap;
typedef DataMap::iterator DMIter;

int main() {
  DataMap sightings;
  generate_n(
    inserter(sightings, sightings.begin()),
    50, SightingGen(animals));
  // Print everything:
  copy(sightings.begin(), sightings.end(),
    ostream_iterator<Sighting>(cout, ""));
  // Print sightings for selected animal:
  while(true) {
    cout << "select an animal or 'q' to quit: ";
    for(int i = 0; i < animals.size(); i++)
      cout << '[' << i << ']' << animals[i] << ' ';
    cout << endl;
    string reply;
    cin >> reply;
    if(reply.at(0) == 'q') return 0;
    istringstream r(reply);
    int i;
    r >> i; // Converts to int
    i %= animals.size();
    // Iterators in "range" denote begin, one
    // past end of matching range:
    pair<DMIter, DMIter> range =
      sightings.equal_range(animals[i]);
    copy(range.first, range.second,
      ostream_iterator<Sighting>(cout, ""));
  }
} ///:~



All the data about a sighting is encapsulated into the class DataPoint, which is simple enough 
that it can rely on the synthesized assignment and copy-constructor. It uses the Standard C 
library time functions to record the time of the sighting. 

In the array of string animal, notice that the char* constructor is automatically used during 
initialization, which makes initializing an array of string quite convenient. Since it's easier to 
use the animal names in a vector, the length of the array is calculated and a vector<string> is 
initialized using the vector(iterator, iterator) constructor. 

The key-value pairs that make up a Sighting are the string which names the type of animal, 
and the DataPoint that says where and when it was sighted. The standard pair template 
combines these two types and is typedefed to produce the Sighting type. Then an ostream 
operator<< is created for Sighting; this will allow you to iterate through a map or multimap 
of Sightings and print it out. 

SightingGen generates random sightings at random data points to use for testing. It has the 
usual operator( ) necessary for a function object, but it also has a constructor to capture and 
store a reference to a vector<string>, which is where the aforementioned animal names are 
stored. 

A DataMap is a multimap of string-DataPoint pairs, which means it stores Sightings. It is 
filled with 50 Sightings using generate_n( ), and printed out (notice that because there is an 
operator<< that takes a Sighting, an ostream_iterator can be created). At this point the user 
is asked to select the animal that they want to see all the sightings for. If you press 'q' the 
program will quit, but if you select an animal number, then the equal_range( ) member 
function is invoked. This returns an iterator (DMIter) to the beginning of the set of matching 
pairs, and one indicating past-the-end of the set. Since only one object can be returned from a 
function, equal_range( ) makes use of pair. Since the range pair has the beginning and 
ending iterators of the matching set, those iterators can be used in copy( ) to print out all the 
sightings for a particular type of animal. 
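Stripped of the generator and the printing, the equal_range( ) idiom looks like this. It is a minimal sketch under simplifying assumptions: the value is a plain int standing in for a DataPoint, and the names Sightings and locationsOf are invented for the illustration.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// animal name -> location id (an int stands in for DataPoint here)
typedef std::multimap<std::string, int> Sightings;

// Collects the values of every entry whose key matches animal.
std::vector<int> locationsOf(const Sightings& s,
                             const std::string& animal) {
  typedef Sightings::const_iterator It;
  // first: start of the matching entries; second: one past their end.
  std::pair<It, It> range = s.equal_range(animal);
  std::vector<int> result;
  for (It it = range.first; it != range.second; ++it)
    result.push_back(it->second);
  return result;
}
```

If no entry matches, both iterators in the pair are equal, so the loop body never runs and an empty vector comes back.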



Multisets 



You've seen the set, which only allows one object of each value to be inserted. The multiset 
is odd by comparison since it allows more than one object of each value to be inserted. This 
seems to go against the whole idea of "setness," where you can ask "is 'it' in this set?" If 
there can be more than one of 'it', then what does that question mean? 

With some thought, you can see that it makes no sense to have more than one object of the 
same value in a set if those duplicate objects are exactly the same (with the possible exception 
of counting occurrences of objects, but as seen earlier in this chapter that can be handled in an 
alternative, more elegant fashion). Thus each duplicate object will have something that makes 
it unique from the other duplicates - most likely different state information that is not used in 
the calculation of the value during the comparison. That is, to the comparison operation, the 
objects look the same but they actually contain some differing internal state. 

Like any STL container that must order its elements, the multiset template uses the less 
template by default to determine element ordering. This uses the contained classes' 
operator<, but you may of course substitute your own comparison function. 
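Substituting your own comparison can be sketched as follows. ByLength, LengthBag and countSameLength are hypothetical names used only here; the comparison looks at the length of each string and nothing else, so strings of equal length count as duplicates even though their contents differ.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Orders strings by length only. Two strings of the same length are
// equivalent as far as the container is concerned.
struct ByLength {
  bool operator()(const std::string& a, const std::string& b) const {
    return a.size() < b.size();
  }
};

typedef std::multiset<std::string, ByLength> LengthBag;

// How many stored strings are equivalent to probe (same length)?
std::size_t countSameLength(const LengthBag& bag,
                            const std::string& probe) {
  return bag.count(probe);
}
```

A multiset keeps "cat" and "dog" side by side under this ordering, whereas a set with the same comparison would keep only the first of them.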

Consider a simple class that contains one element that is used in the comparison, and another 
that is not: 

//: C04:MultiSet1.cpp
// Demonstration of multiset behavior
#include <iostream>
#include <iterator>
#include <set>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

class X {
  char c; // Used in comparison
  int i; // Not used in comparison
  // Don't need default constructor and operator=
  X();
  X& operator=(const X&);
  // Usually need a copy-constructor (but the
  // synthesized version works here)
public:
  X(char cc, int ii) : c(cc), i(ii) {}
  // Notice no operator== is required
  friend bool operator<(const X& x, const X& y) {
    return x.c < y.c;
  }
  // Needed by ostream_iterator<X>:
  friend ostream& operator<<(ostream& os, X x) {
    return os << x.c << ":" << x.i;
  }
};

class Xgen {
  static int i;
  // Number of characters to select from:
  static const int span = 5;
public:
  Xgen() { srand(time(0)); }
  X operator()() {
    char c = 'A' + rand() % span;
    return X(c, i++);
  }
};

int Xgen::i = 0;

typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;

int main() {
  Xmset mset;
  // Fill it with X's:
  generate_n(inserter(mset, mset.begin()),
    25, Xgen());
  // Initialize a regular set from mset:
  set<X> unique(mset.begin(), mset.end());
  copy(unique.begin(), unique.end(),
    ostream_iterator<X>(cout, " "));
  cout << "\n----\n";
  // Iterate over the unique values:
  for(set<X>::iterator i = unique.begin();
      i != unique.end(); i++) {
    pair<Xmit, Xmit> p = mset.equal_range(*i);
    copy(p.first, p.second,
      ostream_iterator<X>(cout, " "));
    cout << endl;
  }
} ///:~

In X, all the comparisons are made with the char c. The comparison is performed with 
operator<, which is all that is necessary for the multiset, since in this example the default 
less comparison object is used. The class Xgen is used to randomly generate X objects, but 
the comparison value is restricted to the span from 'A' to 'E'. In main( ), a multiset<X> is 
created and filled with 25 X objects using Xgen, guaranteeing that there will be duplicate 
keys. So that we know what the unique values are, a regular set<X> is created from the 
multiset (using the iterator, iterator constructor). These values are displayed, then each one 
is used to produce the equal_range( ) in the multiset (equal_range( ) has the same meaning 
here as it does with multimap: all the elements with matching keys). Each set of matching 
keys is then printed. 

As a second example, a (possibly) more elegant version of WordCount.cpp can be created 
using multiset: 





//: C04:MultiSetWordCount.cpp
//{L} StreamTokenizer
// Count occurrences of words using a multiset
#include "StreamTokenizer.h"
#include "../require.h"
#include <string>
#include <set>
#include <fstream>
#include <iostream>
#include <iterator>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  multiset<string> wordmset;
  string word;
  while((word = words.next()).size() != 0)
    wordmset.insert(word);
  typedef multiset<string>::iterator MSit;
  MSit it = wordmset.begin();
  while(it != wordmset.end()) {
    pair<MSit, MSit> p = wordmset.equal_range(*it);
    int count = distance(p.first, p.second);
    cout << *it << ": " << count << endl;
    it = p.second; // Move to the next word
  }
} ///:~

The setup in main( ) is identical to WordCount.cpp, but then each word is simply inserted 
into the multiset<string>. An iterator is created and initialized to the beginning of the 
multiset; dereferencing this iterator produces the current word. equal_range( ) produces the 
starting and ending iterators of the word that's currently selected, and the STL algorithm 
distance( ) (which is in <iterator>) is used to count the number of elements in that range. 
Then the iterator it is moved forward to the end of the range, which puts it at the next word. 
Although if you're unfamiliar with the multiset this code can seem more complex, the density 
of it and the lack of need for supporting classes like Count has a lot of appeal. 
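The walk over groups of equal words can be isolated from the file handling. In this sketch the function name tally is an assumption, and the counts are returned in a map purely so they are easy to inspect; the loop itself is the same equal_range( ) and distance( ) combination described above.

```cpp
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <utility>

// Visits each distinct word once and records how many copies of it
// the multiset holds.
std::map<std::string, int>
tally(const std::multiset<std::string>& words) {
  typedef std::multiset<std::string>::const_iterator It;
  std::map<std::string, int> result;
  It it = words.begin();
  while (it != words.end()) {
    // All copies of the current word are adjacent:
    std::pair<It, It> p = words.equal_range(*it);
    result[*it] = static_cast<int>(std::distance(p.first, p.second));
    it = p.second; // Jump to the first copy of the next word
  }
  return result;
}
```

Each pass through the loop skips a whole group of duplicates, so the number of iterations equals the number of distinct words, not the number of words stored.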

In the end, is this really a "set," or should it be called something else? An alternative is the 
generic "bag" that has been defined in some container libraries, since a bag holds anything at 
all without discrimination - including duplicate objects. This is close, but it doesn't quite fit 
since a bag has no specification about how elements should be ordered, while a multiset 
(which requires that all duplicate elements be adjacent to each other) is even more restrictive 
than the concept of a set, which could use a hashing function to order its elements, in which 
case they would not be in sorted order. Besides, if you wanted to store a bunch of objects 
without any special criteria, you'd probably just use a vector, deque or list. 

Combining STL containers 



There are cases where the "multi" containers (multimap or multiset) are not appropriate. The 
solution is to combine containers, 
which is easily done using the STL. Here, we need a tool that turns out to be a powerful 
general concept, which is a map of vector: 

//: C04:Thesaurus.cpp
// A map of vectors
#include <map>
#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

typedef map<string, vector<string> > Thesaurus;
typedef pair<string, vector<string> > TEntry;
typedef Thesaurus::iterator TIter;

ostream& operator<<(ostream& os, const TEntry& t) {
  os << t.first << ": ";
  copy(t.second.begin(), t.second.end(),
    ostream_iterator<string>(os, " "));
  return os;
}

// A generator for thesaurus test entries:
class ThesaurusGen {
  static const string letters;
  static int count;
public:
  int maxSize() { return letters.size(); }
  ThesaurusGen() { srand(time(0)); }
  TEntry operator()() {
    TEntry result;
    if(count >= maxSize()) count = 0;
    result.first = letters[count++];
    int entries = (rand() % 5) + 2;
    for(int i = 0; i < entries; i++) {
      int choice = rand() % maxSize();
      char cbuf[2] = { 0 };
      cbuf[0] = letters[choice];
      result.second.push_back(cbuf);
    }
    return result;
  }
};

int ThesaurusGen::count = 0;
const string ThesaurusGen::letters("ABCDEFGHIJKL"
  "MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");

int main() {
  Thesaurus thesaurus;
  // Fill with 10 entries:
  generate_n(
    inserter(thesaurus, thesaurus.begin()),
    10, ThesaurusGen());
  // Print everything:
  copy(thesaurus.begin(), thesaurus.end(),
    ostream_iterator<TEntry>(cout, "\n"));
  // Ask for a "word" to look up:
  while(true) {
    cout << "Select a \"word\", 0 to quit: ";
    for(TIter it = thesaurus.begin();
        it != thesaurus.end(); it++)
      cout << (*it).first << ' ';
    cout << endl;
    string reply;
    cin >> reply;
    if(reply.at(0) == '0') return 0; // Quit
    if(thesaurus.find(reply) == thesaurus.end())
      continue; // Not in list, try again
    vector<string>& v = thesaurus[reply];
    copy(v.begin(), v.end(),
      ostream_iterator<string>(cout, " "));
    cout << endl;
  }
} ///:~

A Thesaurus maps a string (the word) to a vector<string> (the synonyms). A TEntry is a 
single entry in a Thesaurus. By creating an ostream operator<< for a TEntry, a single entry 
from the Thesaurus can easily be printed (and the whole Thesaurus can easily be printed 
with copy( )). The ThesaurusGen creates "words" (which are just single letters) and 
"synonyms" for those words (which are just other randomly-chosen single letters) to be used 
as thesaurus entries. It randomly chooses the number of synonym entries to make, but there 
must be at least two. All the letters are chosen by indexing into a static string that is part of 
ThesaurusGen. 

In main( ), a Thesaurus is created, filled with 10 entries and printed using the copy( ) 
algorithm. Then the user is requested to choose a "word" to look up by typing the letter of that 
word. The find( ) member function is used to find whether the entry exists in the map 
(remember, you don't want to use operator[ ] or it will automatically make a new entry if it 
doesn't find a match!). If so, operator[ ] is used to fetch out the vector<string> which is 
displayed. 

Because templates make the expression of powerful concepts easy, you can take this concept 
much further, creating a map of vectors containing maps, etc. For that matter, you can 
combine any of the STL containers this way. 
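The two access patterns used with a map of vectors, adding through operator[ ] and querying through find( ), can be sketched in a few lines. The helper names addSynonym and synonymsOf are assumptions made for this illustration.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > Thesaurus;

// operator[] creates an empty vector the first time a word is used,
// so push_back() always has a vector to append to.
void addSynonym(Thesaurus& t, const std::string& word,
                const std::string& syn) {
  t[word].push_back(syn);
}

// find() leaves the map unchanged; a null pointer means "no such word".
const std::vector<std::string>*
synonymsOf(const Thesaurus& t, const std::string& word) {
  Thesaurus::const_iterator it = t.find(word);
  return it == t.end() ? 0 : &it->second;
}
```

Returning a pointer rather than a reference lets the caller distinguish a missing word from a word with an empty synonym list.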

Cleaning up 

containers of pointers 

In Stlshape.cpp, the pointers did not clean themselves up automatically. It would be 
convenient to be able to do this easily, rather than writing out the code each time. Here is a 
function template that will clean up the pointers in any sequence container; note that it is 
placed in the book's root directory for easy access: 

//: :purge.h
// Delete pointers in an STL sequence container
#ifndef PURGE_H
#define PURGE_H
#include <algorithm>

template<class Seq> void purge(Seq& c) {
  typename Seq::iterator i;
  for(i = c.begin(); i != c.end(); i++) {
    delete *i;
    *i = 0;
  }
}

// Iterator version:
template<class InpIt>
void purge(InpIt begin, InpIt end) {
  while(begin != end) {
    delete *begin;
    *begin = 0;
    begin++;
  }
}
#endif // PURGE_H ///:~

In the first version of purge( ), note that typename is absolutely necessary; indeed this is 
exactly the case that the keyword was added for: Seq is a template argument, and iterator is 
something that is nested within that template. So what does Seq::iterator refer to? The 
typename keyword specifies that it refers to a type, and not something else. 

While the container version of purge( ) must work with an STL-style container, the iterator 
version of purge( ) will work with any range, including an array. 

Here is Stlshape.cpp, modified to use the purge( ) function: 

//: C04:Stlshape2.cpp
// Stlshape.cpp with the purge() function
#include "../purge.h"
#include <vector>
#include <iostream>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual ~Shape() {}
};

class Circle : public Shape {
public:
  void draw() { cout << "Circle::draw\n"; }
  ~Circle() { cout << "~Circle\n"; }
};

class Triangle : public Shape {
public:
  void draw() { cout << "Triangle::draw\n"; }
  ~Triangle() { cout << "~Triangle\n"; }
};

class Square : public Shape {
public:
  void draw() { cout << "Square::draw\n"; }
  ~Square() { cout << "~Square\n"; }
};

typedef std::vector<Shape*> Container;
typedef Container::iterator Iter;

int main() {
  Container shapes;
  shapes.push_back(new Circle);
  shapes.push_back(new Square);
  shapes.push_back(new Triangle);
  for(Iter i = shapes.begin();
      i != shapes.end(); i++)
    (*i)->draw();
  purge(shapes);
} ///:~

When using purge( ), you must be careful to consider ownership issues - if an object pointer 
is held in more than one container, then you must be sure not to delete it twice, and you don't 
want to destroy the object in the first container before the second one is finished with it. 
Purging the same container twice is not a problem, because purge( ) sets the pointer to zero 
once it deletes that pointer, and calling delete for a zero pointer is a safe operation. 
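The claim that purging twice is harmless can be checked with a small instrumented class. This is a sketch only: purgeRange mirrors the iterator version of purge( ) under a different name so it stands alone, and Tracked is a hypothetical class that counts its live objects.

```cpp
// Mirrors the iterator version of purge() for a self-contained example.
template<class InpIt>
void purgeRange(InpIt begin, InpIt end) {
  while (begin != end) {
    delete *begin;  // Deleting a zero pointer does nothing
    *begin = 0;     // So a second purge finds only zero pointers
    ++begin;
  }
}

// Hypothetical class that counts how many instances are alive.
struct Tracked {
  static int alive;
  Tracked() { ++alive; }
  ~Tracked() { --alive; }
};
int Tracked::alive = 0;
```

Applied to an array of three Tracked pointers, the first call to purgeRange( ) destroys all three objects and zeroes the pointers; a second call over the same range destroys nothing. The ownership caveat still applies: zeroing the pointers in one container does nothing for copies of those pointers held elsewhere.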

Creating your own containers 

With the STL as a foundation, it's possible to create your own containers. Assuming you 
follow the same model of providing iterators, your new container will behave as if it were a 
built-in STL container. 

Consider the "ring" data structure, which is a circular sequence container. If you reach the 
end, it just wraps around to the beginning. This can be implemented on top of a list as 
follows: 

//: C04:Ring.cpp
// Making a "ring" data structure from the STL
#include <iostream>
#include <list>
#include <string>
using namespace std;

template<class T>
class Ring {
  list<T> lst;
public:
  // Declaration necessary so the following
  // 'friend' statement sees this 'iterator'
  // instead of std::iterator:
  class iterator;
  friend class iterator;
  class iterator : public std::iterator<
    std::bidirectional_iterator_tag,T,ptrdiff_t>{
    typename list<T>::iterator it;
    list<T>* r;
  public:
    // "typename" necessary to resolve nesting:
    iterator(list<T>& lst,
      const typename list<T>::iterator& i)
      : it(i), r(&lst) {}
    bool operator==(const iterator& x) const {
      return it == x.it;
    }
    bool operator!=(const iterator& x) const {
      return !(*this == x);
    }
    typename list<T>::reference operator*() const {
      return *it;
    }
    iterator& operator++() {
      ++it;
      if(it == r->end())
        it = r->begin();
      return *this;
    }
    iterator operator++(int) {
      iterator tmp = *this;
      ++*this;
      return tmp;
    }
    iterator& operator--() {
      if(it == r->begin())
        it = r->end();
      --it;
      return *this;
    }
    iterator operator--(int) {
      iterator tmp = *this;
      --*this;
      return tmp;
    }
    iterator insert(const T& x) {
      return iterator(*r, r->insert(it, x));
    }
  };
  void push_back(const T& x) {
    lst.push_back(x);
  }
  iterator begin() {
    return iterator(lst, lst.begin());
  }
  int size() { return lst.size(); }
};

int main() {
  Ring<string> rs;
  rs.push_back("one");
  rs.push_back("two");
  rs.push_back("three");
  rs.push_back("four");
  rs.push_back("five");
  Ring<string>::iterator it = rs.begin();
  it++; it++;
  it.insert("six");
  it = rs.begin();
  // Twice around the ring:
  for(int i = 0; i < rs.size() * 2; i++)
    cout << *it++ << endl;
} ///:~

You can see that the iterator is where most of the coding is done. The Ring iterator must 
know how to loop back to the beginning, so it must keep a reference to the list of its "parent" 
Ring object in order to know if it's at the end and how to get back to the beginning. 

You'll notice that the interface for Ring is quite limited; in particular there is no end( ), since 
a ring just keeps looping. This means that you won't be able to use a Ring in any STL 
algorithms that require a past-the-end iterator - which is many of them. (It turns out that 
adding this feature is a non-trivial exercise). Although this can seem limiting, consider stack, 
queue and priority_queue, which don't produce any iterators at all! 



Freely-available STL extensions



The standard containers are not complete. For example, the standard implementations of set
and map use trees, and although these are reasonably fast they may not be fast enough for
your needs. In the C++ Standards Committee it was generally agreed that hashed
implementations of set and map should have been included in Standard C++; however, there
was not considered to be enough time to add these components, and thus they were left out.

Fortunately, there are freely-available alternatives. One of the nice things about the STL is 
that it establishes a basic model for creating STL-like classes, so anything built using the 
same model is easy to understand if you are already familiar with the STL. 

The SGI STL (freely available at http://www.sgi.com/Technology/STL/) is one of the most
robust implementations of the STL, and can be used to replace your compiler's STL if that is
found wanting. In addition they've added a number of extensions including hash_set,
hash_multiset, hash_map, hash_multimap, slist (a singly-linked list) and rope (a variant of
string optimized for very large strings and fast concatenation and substring operations).

Let's consider a performance comparison between a tree-based map and the SGI hash_map.
To keep things simple, the mappings will be from int to int:

//: C04:MapVsHashMap.cpp
// The hash_map header is not part of the
// Standard C++ STL. It is an extension that
// is only available as part of the SGI STL:
#include <hash_map>
#include <iostream>
#include <map>
#include <ctime>
using namespace std;

int main() {
  hash_map<int, int> hm;
  map<int, int> m;
  clock_t ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.insert(make_pair(j,j));
  cout << "map insertions: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.insert(make_pair(j,j));
  cout << "hash_map insertions: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m[j];
  cout << "map::operator[] lookups: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm[j];
  cout << "hash_map::operator[] lookups: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.find(j);
  cout << "map::find() lookups: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.find(j);
  cout << "hash_map::find() lookups: "
    << clock() - ticks << endl;
} ///:~

The performance test I ran showed a speed improvement of roughly 4:1 for the hash_map
over the map in all operations (and as expected, find( ) is slightly faster than operator[ ] for
lookups for both types of map). If a profiler shows a bottleneck in your map, you should
consider a hash_map.



Summary 



The goal of this chapter was not just to introduce the STL containers in some considerable
depth (of course, not every detail could be covered here, but you should have
enough now that you can look up further information in the other resources). My higher hope is that this
chapter has made you grasp the incredible power available in the STL, and shown
you how much faster and more efficient your programs can be by using and
understanding the STL.

The fact that I could not escape from introducing some of the STL algorithms in this
chapter suggests how useful they can be. In the next chapter you'll get a much more focused
look at the algorithms.






Exercises 



1. Create a set<char>, then open a file (whose name is provided on the
command line) and read that file in a char at a time, placing each char in
the set. Print the results and observe the organization, and whether there are
any letters in the alphabet that are not used in that particular file.

2. Create a kind of "hangman" game. Create a class that contains a char and a
bool to indicate whether that char has been guessed yet. Randomly select a
word from a file, and read it into a vector of your new type. Repeatedly ask
the user for a character guess, and after each guess display the characters in
the word that have been guessed, and underscores for the characters that
haven't. Allow a way for the user to guess the whole word. Decrement a
value for each guess, and if the user can get the whole word before the value
goes to zero, they win.

3. Modify WordCount.cpp so that it uses insert( ) instead of operator[ ] to
insert elements in the map.

4. Modify WordCount.cpp so that it uses a multimap instead of a map.

5. Create a generator that produces random int values between 0 and 20. Use
this to fill a multiset<int>. Count the occurrences of each value, following
the example given in MultiSetWordCount.cpp.

6. Change StlShape.cpp so that it uses a deque instead of a vector.

7. Modify Reversible.cpp so it works with deque and list instead of vector.

8. Modify ProgVals.h and ProgVals.cpp so that they expect leading hyphens
to distinguish command-line arguments.

9. Create a second version of ProgVals.h and ProgVals.cpp that uses a set
instead of a map to manage single-character flags on the command line
(such as -a -b -c etc.) and also allows the characters to be ganged up behind
a single hyphen (such as -abc).

10. Use a stack<int> and build a Fibonacci sequence on the stack. The
program's command line should take the number of Fibonacci elements
desired, and you should have a loop that looks at the last two elements on
the stack and pushes a new one for every pass through the loop.

11. Open a text file whose name is provided on the command line. Read the file
a word at a time (hint: use >>) and use a multiset<string> to create a word
count for each word.

12. Modify BankTeller.cpp so that the policy that decides when a teller is
added or removed is encapsulated inside a class.

13. Create two classes A and B (feel free to choose more interesting names).
Create a multimap<A, B> and fill it with key-value pairs, ensuring that
there are some duplicate keys. Use equal_range( ) to discover and print a
range of objects with duplicate keys. Note you may have to add some
functions in A and/or B to make this program work.

14. Perform the above exercise for a multiset<A>.

15. Create a class that has an operator< and an ostream& operator<<. The
class should contain a priority number. Create a generator for your class that
makes a random priority number. Fill a priority_queue using your
generator, then pull the elements out to show they are in the proper order.

16. Rewrite Ring.cpp so it uses a deque instead of a list for its underlying
implementation.

17. Modify Ring.cpp so that the underlying implementation can be chosen
using a template argument (let that template argument default to list).

18. Open a file and read it into a single string. Turn the string into a
stringstream. Read tokens from the stringstream into a list<string> using
a TokenIterator.

19. Compare the performance of stack based on whether it is implemented with
vector, deque or list.

20. Create an iterator class called BitBucket that just absorbs whatever you
send to it without writing it anywhere.

21. Create a template that implements a singly-linked list called SList. Provide
a default constructor, begin( ) and end( ) functions (thus you must create
the appropriate nested iterator), insert( ), erase( ) and a destructor.

22. (More challenging) Create a little command language. Each command can
simply print its name and its arguments, but you may also want to make it
perform other activities like run programs. The commands will be read from
a file that you pass as a command-line argument, or from standard input if
no file is given. Each command is on a single line, and lines beginning with
'#' are comments. A line begins with the one-word command itself,
followed by any number of arguments. Commands and arguments are
separated by spaces. Use a map that maps string objects (the name of the
command) to object pointers. The object pointers point to objects of a base
class Command that has a virtual execute(string args) function, where
args contains all the arguments for that command (execute( ) will parse its
own arguments from args). Each different type of command is represented
by a class that is inherited from Command.

23. Add features to the above exercise so that you can have labels, if-then
statements, and the ability to jump program execution to a label.
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5: STL Algorithms 

The other half of the STL is the algorithms, which are 
templatized functions designed to work with the containers 
(or, as you will see, anything that can behave like a 
container, including arrays and string objects). 

The STL was originally designed around the algorithms. The goal was that you use algorithms
for almost every piece of code that you write. In this sense it was a bit of an experiment, and 
only time will tell how well it works. The real test will be in how easy or difficult it is for the 
average programmer to adapt. At the end of this chapter you'll be able to decide for yourself 
whether you find the algorithms addictive or too confusing to remember. If you're like me, 
you'll resist them at first but then tend to use them more and more. 

Before you make your judgment, however, there's one other thing to consider. The STL 
algorithms provide a vocabulary with which to describe solutions. That is, once you become 
familiar with the algorithms you'll have a new set of words with which to discuss what you're 
doing, and these words are at a higher level than what you've had before. You don't have to 
say "this loop moves through and assigns from here to there ... oh, I see, it's copying!" 
Instead, you say copy(). This is the kind of thing we've been doing in computer 
programming from the beginning — creating more dense ways to express what we're doing 
and spending less time saying how we're doing it. Whether the STL algorithms and generic 
programming are a great success in accomplishing this remains to be seen, but that is 
certainly the objective. 

Function objects 

A concept that is used heavily in the STL algorithms is the function object, which was
introduced in the previous chapter. A function object has an overloaded operator( ), and the
result is that a template function can't tell whether you've handed it a pointer to a function or
an object that has an operator( ); all the template function knows is that it can attach an
argument list to the object as if it were a pointer to a function:

//: C05:FuncObject.cpp
// Simple function objects
#include <iostream>
using namespace std;

template<class UnaryFunc, class T>
void callFunc(T& x, UnaryFunc f) {
  f(x);
}

void g(int& x) {
  x = 47;
}

struct UFunc {
  void operator()(int& x) {
    x = 48;
  }
};

int main() {
  int y = 0;
  callFunc(y, g);
  cout << y << endl;
  y = 0;
  callFunc(y, UFunc());
  cout << y << endl;
} ///:~

The template callFunc( ) says "give me an f and an x, and I'll write the code f(x)." In main( ),
you can see that it doesn't matter if f is a pointer to a function (as in the case of g( )), or if it's
a function object (which is created as a temporary object by the expression UFunc( )). Notice
you can only accomplish this genericity with a template function; a non-template function is
too particular about its argument types to allow such a thing. The STL algorithms use this
flexibility to take either a function pointer or a function object, but you'll usually find that
creating a function object is more powerful and flexible.

The function object is actually a variation on the theme of a callback, which is described in
the design patterns chapter. A callback allows you to vary the behavior of a function or object
by passing, as an argument, a way to execute some other piece of code. Here, we are handing
callFunc( ) a pointer to a function or a function object.

The following descriptions of function objects should not only make that topic clear, but also 
give you an introduction to the way the STL algorithms work. 

Classification of function objects 

The Standard C++ library classifies function objects based on the number of arguments that
their operator( ) takes and the kind of value
returned by that operator (of course, this is also true for function pointers when you treat them
as function objects). The classification of function objects in the STL is based on whether the
operator( ) takes zero, one or two arguments, and if it returns a bool or non-bool value.

Generator: Takes no arguments, and returns a value of the desired type. A
RandomNumberGenerator is a special case.

UnaryFunction: Takes a single argument of any type and returns a value which may be of a
different type.

BinaryFunction: Takes two arguments of any two types and returns a value of any type.

A special case of the unary and binary functions is the predicate, which simply means a
function that returns a bool. A predicate is a function you use to make a true/false decision.

Predicate: This can also be called a UnaryPredicate. It takes a single argument of any type
and returns a bool.

BinaryPredicate: Takes two arguments of any two types and returns a bool.

StrictWeakOrdering: A binary predicate that says that if you have two objects and neither
one is less than the other, they can be regarded as equivalent to each other.

In addition, there are sometimes qualifications on object types that are passed to an algorithm.
These qualifications are given in the template argument type identifier name:

LessThanComparable: A class that has a less-than operator<.

Assignable: A class that has an assignment operator= for its own type.

EqualityComparable: A class that has an equivalence operator== for its own type.



Automatic creation of function objects 

The STL has, in the header file <functional>, a set of templates that will automatically create
function objects for you. These generated function objects are admittedly simple, but the goal 
is to provide very basic functionality that will allow you to compose more complicated 
function objects, and in many situations this is all you'll need. Also, you'll see that there are 
some function object adapters that allow you to take the simple function objects and make 
them slightly more complicated. 



Here are the templates that generate function objects, along with the expressions that they
produce:

Name             Type               Result produced by generated function object
plus             BinaryFunction     arg1 + arg2
minus            BinaryFunction     arg1 - arg2
multiplies       BinaryFunction     arg1 * arg2
divides          BinaryFunction     arg1 / arg2
modulus          BinaryFunction     arg1 % arg2
negate           UnaryFunction      -arg1
equal_to         BinaryPredicate    arg1 == arg2
not_equal_to     BinaryPredicate    arg1 != arg2
greater          BinaryPredicate    arg1 > arg2
less             BinaryPredicate    arg1 < arg2
greater_equal    BinaryPredicate    arg1 >= arg2
less_equal       BinaryPredicate    arg1 <= arg2
logical_and      BinaryPredicate    arg1 && arg2
logical_or       BinaryPredicate    arg1 || arg2
logical_not      UnaryPredicate     !arg1
not1( )          Unary Logical      !(UnaryPredicate(arg1))
not2( )          Binary Logical     !(BinaryPredicate(arg1, arg2))



The following example provides simple tests for each of the built-in basic function object
templates. This way, you can see how to use each one, along with their resulting behavior.



//: C05:FunctionObjects.cpp
// Using the predefined function object templates
// in the Standard C++ library
// This will be defined shortly:
#include "Generators.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <iterator>
#include <functional>
#include <cstdlib>
#include <ctime>
using namespace std;

template<typename T>
void print(vector<T>& v, const char* msg = "") {
  if(*msg != 0)
    cout << msg << ":" << endl;
  copy(v.begin(), v.end(),
    ostream_iterator<T>(cout, " "));
  cout << endl;
}

template<typename Contain, typename UnaryFunc>
void testUnary(Contain& source, Contain& dest,
  UnaryFunc f) {
  transform(source.begin(), source.end(),
    dest.begin(), f);
}

template<typename Contain1, typename Contain2,
  typename BinaryFunc>
void testBinary(Contain1& src1, Contain1& src2,
  Contain2& dest, BinaryFunc f) {
  transform(src1.begin(), src1.end(),
    src2.begin(), dest.begin(), f);
}

// Executes the expression, then stringizes the
// expression into the print statement:
#define T(EXPR) EXPR; print(r, "After " #EXPR);
// For Boolean tests:
#define B(EXPR) EXPR; print(br, "After " #EXPR);

// Boolean random generator:
struct BRand {
  BRand() { srand(time(0)); }
  bool operator()() {
    return rand() > RAND_MAX / 2;
  }
};

int main() {
  const int sz = 10;
  const int max = 50;
  vector<int> x(sz), y(sz), r(sz);
  // An integer random number generator:
  URandGen urg(max);
  generate_n(x.begin(), sz, urg);
  generate_n(y.begin(), sz, urg);
  // Add one to each to guarantee nonzero divide:
  transform(y.begin(), y.end(), y.begin(),
    bind2nd(plus<int>(), 1));
  // Guarantee one pair of elements is ==:
  x[0] = y[0];
  print(x, "x");
  print(y, "y");
  // Operate on each element pair of x & y,
  // putting the result into r:
  T(testBinary(x, y, r, plus<int>()));
  T(testBinary(x, y, r, minus<int>()));
  T(testBinary(x, y, r, multiplies<int>()));
  T(testBinary(x, y, r, divides<int>()));
  T(testBinary(x, y, r, modulus<int>()));
  T(testUnary(x, r, negate<int>()));
  vector<bool> br(sz); // For Boolean results
  B(testBinary(x, y, br, equal_to<int>()));
  B(testBinary(x, y, br, not_equal_to<int>()));
  B(testBinary(x, y, br, greater<int>()));
  B(testBinary(x, y, br, less<int>()));
  B(testBinary(x, y, br, greater_equal<int>()));
  B(testBinary(x, y, br, less_equal<int>()));
  B(testBinary(x, y, br,
    not2(greater_equal<int>())));
  B(testBinary(x,y,br,not2(less_equal<int>())));
  vector<bool> b1(sz), b2(sz);
  generate_n(b1.begin(), sz, BRand());
  generate_n(b2.begin(), sz, BRand());
  print(b1, "b1");
  print(b2, "b2");
  B(testBinary(b1, b2, br, logical_and<int>()));
  B(testBinary(b1, b2, br, logical_or<int>()));
  B(testUnary(b1, br, logical_not<int>()));
  B(testUnary(b1, br, not1(logical_not<int>())));
} ///:~

To keep this example small, some tools are created. The print( ) template is designed to print
any vector<T>, along with an optional message. Since print( ) uses the STL copy( )
algorithm to send objects to cout via an ostream_iterator, the ostream_iterator must know
the type of object it is printing, and therefore the print( ) template must know this type also.
However, you'll see in main( ) that the compiler can deduce the type of T when you hand it a
vector<T>, so you don't have to hand it the template argument explicitly; you just say
print(x) to print the vector<T> x.

The next two template functions automate the process of testing the various function object
templates. There are two since the function objects are either unary or binary. In testUnary( ),
you pass a source and destination vector, and a unary function object to apply to the source
vector to produce the destination vector. In testBinary( ), there are two source vectors which
are fed to a binary function to produce the destination vector. In both cases, the template
functions simply turn around and call the transform( ) algorithm, although the tests could
certainly be more complex.

For each test, you want to see a string describing what the test is, followed by the results of
the test. To automate this, the preprocessor comes in handy; the T( ) and B( ) macros each
take the expression you want to execute. They call that expression, then call print( ), passing
it the result vector (they assume the expression changes a vector named r and br,
respectively), and to produce the message the expression is "string-ized" using the
preprocessor. So that way you see the code of the expression that is executed followed by the
result vector.

The last little tool is a generator object that creates random bool values. To do this, it gets a
random number from rand( ) and tests to see if it's greater than RAND_MAX/2. If the
random numbers are evenly distributed, this should happen half the time.

In main( ), three vector<int> are created: x and y for source values, and r for results. To
initialize x and y with random values no greater than 50, a generator of type URandGen is
used; this will be defined shortly. Since there is one operation where elements of x are divided
by elements of y, we must ensure that there are no zero values of y. This is accomplished
using the transform( ) algorithm, taking the source values from y and putting the results back
into y. The function object for this is created with the expression:

bind2nd(plus<int>(), 1)

This uses the plus function object that adds two objects together. It is thus a binary function
which requires two arguments; we only want to pass it one argument (the element from y) and
have the other argument be the value 1. A "binder" does the trick (we will look at these next).
The binder in this case says "make a new function object which is the plus function object
with the second argument fixed at 1."

Another of the tests in the program compares the elements in the two vectors for equality, so 
it is interesting to guarantee that at least one pair of elements is equivalent; in this case 
element zero is chosen. 

Once the two vectors are printed, T( ) is used to test each of the function objects that produces 
a numerical value, and then B( ) is used to test each function object that produces a Boolean
result. The result is placed into a vector<bool>, and when this vector is printed it produces a 
'1' for a true value and a '0' for a false value. 

Binders 
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For example, suppose you want to find integers that are less tlian a particular value, say 20. 
Sensibly enough, the STL algorithms have a function called find_if( ) that will search through
a sequence; however, find_if( ) requires a unary predicate to tell it if this is what you're
looking for. This unary predicate can of course be some function object that you have written
by hand, but it can also be created using the built-in function object templates. In this case, the
less template will work, but that produces a binary predicate, so we need some way of
forming a unary predicate. The binder templates (which work with any binary function object,
not just binary predicates) give you two choices:

bind1st(const BinaryFunction& op, const T& t);
bind2nd(const BinaryFunction& op, const T& t);

Both bind t to one of the arguments of op, but bind1st( ) binds t to the first argument, and
bind2nd( ) binds t to the second argument. With less, the function object that provides the
solution to our exercise is:

bind2nd(less<int>(), 20);

This produces a new function object that returns true if its argument is less than 20. Here it is
used with find_if( ):

//: C05:Binder1.cpp
// Using STL "binders"
#include "Generators.h"
#include "copy_if.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <iterator>
#include <functional>
using namespace std;

int main() {
  // The values of sz and max are missing from this
  // copy of the listing; 10 and 40 are assumed:
  const int sz = 10;
  const int max = 40;
  vector<int> a(sz), r;
  URandGen urg(max);
  ostream_iterator<int> out(cout, " ");
  generate_n(a.begin(), sz, urg);
  copy(a.begin(), a.end(), out);
  // Only compiles if vector<int>::iterator is int*:
  int* d = find_if(a.begin(), a.end(),
    bind2nd(less<int>(), 20));
  cout << "\n *d = " << *d << endl;
  // copy_if() is not in the Standard C++ library
  // but is defined later in the chapter:
  copy_if(a.begin(), a.end(), back_inserter(r),
    bind2nd(less<int>(), 20));
  copy(r.begin(), r.end(), out);
  cout << endl;
} ///:~

The vector<int> a is filled with random numbers between 0 and max. find_if( ) finds the first
element in a that satisfies the predicate (that is, which is less than 20) and returns an iterator to
it (here, the type of the iterator is written as int*, which works only when the library
implements vector<int>::iterator as a plain pointer; vector<int>::iterator is the precise,
portable type).

A more interesting algorithm to use is copy_if( ), which isn't part of the STL but is defined at
the end of this chapter. This algorithm only copies an element from the source to the
destination if that element satisfies a predicate. So the resulting vector will only contain
elements that are less than 20.

Here's a second example, using a vector<string> and replacing strings that satisfy particular
conditions:

//: C05:Binder2.cpp
// More binders
#include <algorithm>
#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <functional>
using namespace std;

int main() {
  ostream_iterator<string> out(cout, " ");
  vector<string> v, r;
  v.push_back("Hi");
  v.push_back("Hi");
  v.push_back("Hey");
  v.push_back("Hee");
  v.push_back("Hi");
  copy(v.begin(), v.end(), out);
  cout << endl;
  // Replace each "Hi" with "Ho":
  replace_copy_if(v.begin(), v.end(),
    back_inserter(r),
    bind2nd(equal_to<string>(), "Hi"), "Ho");
  copy(r.begin(), r.end(), out);
  cout << endl;
  // Replace anything that's not "Hi" with "Ho":
  replace_if(v.begin(), v.end(),
    not1(bind2nd(equal_to<string>(), "Hi")), "Ho");
  copy(v.begin(), v.end(), out);
  cout << endl;
} ///:~

This uses another pair of STL algorithms. The first, replace_copy_if( ), copies each element
from a source range to a destination range, performing replacements on those that satisfy a
particular unary predicate. The second, replace_if( ), doesn't do any copying but instead
performs the replacements directly into the original range.

A binder doesn't have to produce a unary predicate; it can also create a unary function (that is, 
a function that returns something other than bool). For example, suppose you'd like to 
multiply every element in a vector by 10. Using a binder with the transform( ) algorithm
does the trick: 

//: C05:Binder3.cpp
#include "Generators.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <iterator>
#include <functional>
using namespace std;

int main() {
  ostream_iterator<int> out(cout, " ");
  vector<int> v(15);
  generate(v.begin(), v.end(), URandGen(20));
  copy(v.begin(), v.end(), out);
  cout << endl;
  transform(v.begin(), v.end(), v.begin(),
    bind2nd(multiplies<int>(), 10));
  copy(v.begin(), v.end(), out);
  cout << endl;
} ///:~

Since the third argument to transform( ) is the same as the first, the resulting elements are
copied back into the source vector. The function object created by bind2nd( ) in this case
produces an int result.

The "bound" argument to a binder cannot be a function object, but it does not have to be a
compile-time constant. For example:

//: C05:Binder4.cpp
// The bound argument does not have
// to be a compile-time constant
#include "copy_if.h"
#include "PrintSequence.h"
#include "../require.h"
#include <iostream>
#include <algorithm>
#include <functional>
#include <cstdlib>
using namespace std;

int boundedRand() { return rand() % 100; }

int main(int argc, char* argv[]) {
  requireArgs(argc, 1, "usage: Binder4 int");
  const int sz = 20;
  int a[20], b[20] = {0};
  generate(a, a + sz, boundedRand);
  int* end = copy_if(a, a + sz, b,
    bind2nd(greater<int>(), atoi(argv[1])));
  // Sort for easier viewing:
  sort(a, a + sz);
  sort(b, end);
  print(a, a + sz, "array a", " ");
  print(b, end, "values greater than yours", " ");
} ///:~


Here, an array is filled with random numbers between 0 and 99, and the user provides a
value on the command line. In the copy_if( ) call, you can see that the bound argument to
bind2nd( ) is the result of the function call atoi( ) (from <cstdlib>).



Function pointer adapters 



Any place in an STL algorithm where a function object is required, it's very conceivable that
you'd like to use a function pointer instead. Actually, you can use an ordinary function
pointer; that's how the STL was designed, so that a "function object" can actually be
anything that can be dereferenced using an argument list. For example, the rand( ) random
number generator can be passed to generate( ) or generate_n( ) as a function pointer, like
this:

//: C05:RandGenTest.cpp
// A little test of the random number generator
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
#include <cstdlib>
#include <ctime>
using namespace std;

int main() {
  const int sz = 1000;
  int v[sz];
  srand(time(0)); // Seed the random generator
  for(int i = 0; i < 300; i++) {
    // Using a naked pointer to function:
    generate(v, v + sz, std::rand);
    int count = count_if(v, v + sz,
      bind2nd(greater<int>(), RAND_MAX/2));
    cout << (((double)count)/((double)sz)) * 100
      << ' ';
  }
} ///:~



The "iterators" in this case are just the starting and past-the-end pointers for the array v, and
the generator is just a pointer to the standard library rand( ) function. The program repeatedly
generates a group of random numbers, then it uses the STL algorithm count_if( ) and a
predicate that tells whether a particular element is greater than RAND_MAX/2. The result is
the number of elements that match this criterion; this is divided by the total number of
elements and multiplied by 100 to produce the percentage of elements greater than the
midpoint. If the random number generator is reasonable, this value should hover at around
50% (of course, there are many other tests to determine if the random number generator is
reasonable).

The ptr_fun( ) adapters take a pointer to a function and turn it into a function object. They are
not designed for a function that takes no arguments, like the one above (that is, a generator).
Instead, they are for unary functions and binary functions. However, these could also be
simply passed as if they were function objects, so the ptr_fun( ) adapters might at first appear
to be redundant. Here's an example where using ptr_fun( ) and simply passing the address of
the function both produce the same results:

//: C05:PtrFun1.cpp
// Using ptr_fun() for single-argument functions
#include <algorithm>
#include <vector>
#include <iostream>
#include <iterator>
#include <functional>
#include <cstdlib>
using namespace std;

const char* n[] = { "01.23", "91.370", "56.661",
  "023.230", "19.959", "1.0", "3.14159" };
const int nsz = sizeof n / sizeof *n;

template<typename InputIter>
void print(InputIter first, InputIter last) {
  while(first != last)
    cout << *first++ << "\t";
  cout << endl;
}

int main() {
  print(n, n + nsz);
  vector<double> vd;
  transform(n, n + nsz, back_inserter(vd), atof);
  print(vd.begin(), vd.end());
  transform(n, n + nsz, vd.begin(), ptr_fun(atof));
  print(vd.begin(), vd.end());
} ///:~

The goal of this program is to convert an array of char* which are ASCII representations of 
floating-point numbers into a vector<double>. After defining this array and the print( ) 
template (which encapsulates the act of printing a range of elements), you can see 
transform( ) used with atof( ) as a "naked" pointer to a function, and then a second time with 
atof passed to ptr_fun( ). The results are the same. So why bother with ptr_fun( )? Well, the 
actual effect of ptr_fun( ) is to create a function object with an operator( ). This function 
object can then be passed to other template adapters, such as binders, to create new function 
objects. As you'll see a bit later, the SGI extensions to the STL contain a number of other 
function templates to enable this, but in the Standard C++ STL there are only the bind1st( ) 
and bind2nd( ) function templates, and these expect binary function objects as their first 
arguments. In the above example, only the ptr_fun( ) for a unary function is used, and that 
doesn't work with the binders. So ptr_fun( ) used with a unary function in Standard C++ 
really is redundant (note that Gnu g++ uses the SGI STL). 
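
To make "a function object with an operator( )" concrete, here is a minimal sketch of the kind of wrapper that ptr_fun( ) produces for a unary function. The names UnaryFnPtr, make_unary, toDouble and convert are hypothetical, invented for this illustration; the Standard's own adapter class is pointer_to_unary_function. Note also that ptr_fun( ) was deprecated in C++11 and removed in C++17, so with a newer compiler a hand-written wrapper like this one (or a lambda) takes its place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <iterator>
#include <vector>

// Hypothetical stand-in for what ptr_fun() builds: a class that stores
// a pointer to a one-argument function and calls it via operator().
template<typename Arg, typename Result>
class UnaryFnPtr {
  Result (*fp)(Arg);
public:
  typedef Arg argument_type;   // Adapters such as binders read these
  typedef Result result_type;
  explicit UnaryFnPtr(Result (*f)(Arg)) : fp(f) { assert(f != 0); }
  Result operator()(Arg a) const { return fp(a); }
};

// Helper so the template arguments are deduced (hypothetical name):
template<typename Arg, typename Result>
UnaryFnPtr<Arg, Result> make_unary(Result (*f)(Arg)) {
  return UnaryFnPtr<Arg, Result>(f);
}

// Plain wrapper around atof(), so we own the function we point to:
inline double toDouble(const char* s) { return std::atof(s); }

// Converts a range of C strings using the wrapped function pointer:
inline std::vector<double> convert(const char* const* first,
                                   const char* const* last) {
  std::vector<double> out;
  std::transform(first, last, std::back_inserter(out),
                 make_unary(toDouble));
  return out;
}
```

The two typedefs are the part that matters to the binders: they are how an adapter discovers the argument and return types of the function it wraps.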

With a binary function and a binder, things can be a little more interesting. This program 
produces the squares of the input vector d: 

//: C05:PtrFun2.cpp
// Using ptr_fun() for two-argument functions
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
#include <cmath>
using namespace std;

double d[] = { 01.23, 91.370, 56.661,
  023.230, 19.959, 1.0, 3.14159 };
const int dsz = sizeof d / sizeof *d;
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int main() {
  vector<double> vd;
  transform(d, d + dsz, back_inserter(vd),
    bind2nd(ptr_fun(pow), 2.0));
  copy(vd.begin(), vd.end(),
    ostream_iterator<double>(cout, " "));
  cout << endl;
} ///:~

Here, ptr_fun( ) is indispensable; bind2nd( ) must have a function object as its first argument 
and a pointer to function won't cut it. 
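
For comparison, here is a sketch of the same squaring operation as it might be written with a compiler newer than this book. Both bind2nd( ) and ptr_fun( ) were deprecated in C++11 and removed in C++17, and a lambda now does the binding. The function name squares is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <iterator>
#include <vector>

// Returns a new vector holding each input value raised to the power 2.
// The lambda fixes pow()'s second argument, the job bind2nd() did above.
inline std::vector<double> squares(const std::vector<double>& in) {
  std::vector<double> out;
  out.reserve(in.size());
  std::transform(in.begin(), in.end(), std::back_inserter(out),
                 [](double x) { return std::pow(x, 2.0); });
  assert(out.size() == in.size());
  return out;
}
```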

A trickier problem is that of converting a member function into a function object suitable for 
using in the STL algorithms. As a simple example, suppose we have the "shape" problem and 
would like to apply the draw( ) member function to each pointer in a container of Shape: 

//: C05:MemFun1.cpp
// Applying pointers to member functions
#include "../purge.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual ~Shape() {}
};

class Circle : public Shape {
public:
  virtual void draw() {
    cout << "Circle::Draw()" << endl;
  }
  ~Circle() {
    cout << "Circle::~Circle()" << endl;
  }
};

class Square : public Shape { 
public:
  virtual void draw() {
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si) { 
<< "Squa 



int main() {
  vector<Shape*> vs;
  vs.push_back(new Circle);
  vs.push_back(new Square);
  for_each(vs.begin(), vs.end(),
    mem_fun(&Shape::draw));
  purge(vs);
} ///:~



The for_each( ) function does just what it sounds like it does; passes each element in the 
range determined by the first two (iterator) arguments to the function object which is its third 
argument. In this case we want the function object to be created from one of the member 
functions of the class itself, and so the function object's "argument" becomes the pointer to 
the object that the member function is called for. To produce such a function object, the 
mem_fun( ) template takes a pointer to member as its argument. 

The mem_fun( ) functions are for producing function objects that are called using a pointer to 
the object that the member function is called for, while mem_fun_ref( ) is used for calling the 
member function directly for an object. One set of overloads of both mem_fun( ) and 
mem_fun_ref( ) is for member functions that take zero arguments and one argument, and 
this is multiplied by two to handle const vs. non-const member functions. However, 
templates and overloading take care of sorting all of that out; all you need to remember is 
when to use mem_fun( ) vs. mem_fun_ref( ). 
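
The pointer-versus-object distinction can be sketched in a few lines. This version uses std::mem_fn (added in C++11), which accepts either a pointer or an object, rather than mem_fun( ) and mem_fun_ref( ), which were deprecated in C++11 and removed in C++17. The Counter class and the bumpAll( ) helpers are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical class with a zero-argument member function:
class Counter {
  int n;
public:
  Counter() : n(0) {}
  void bump() { ++n; }
  int value() const { return n; }
};

// Container of pointers: the role mem_fun() plays in the text.
inline void bumpAll(std::vector<Counter*>& v) {
  std::for_each(v.begin(), v.end(), std::mem_fn(&Counter::bump));
}

// Container of objects: the role mem_fun_ref() plays in the text.
inline void bumpAll(std::vector<Counter>& v) {
  std::for_each(v.begin(), v.end(), std::mem_fn(&Counter::bump));
}
```

With the older adapters the first overload would need mem_fun( ) and the second mem_fun_ref( ); that is the whole of the distinction you have to remember.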



Suppose you have a container of objects (not pointers) and you want to call a member 
function that takes an argument. The argument you pass should come from a second container 
of objects. To accomplish this, the second overloaded form of the transform( ) algorithm 
is used: 

//: C05:MemFun2.cpp
// Applying pointers to member functions
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
using namespace std;
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Angle ( 



int main() {
  vector<Angle> va;
  for(int i = 0; i < 50; i += 10)
    va.push_back(Angle(i));
  int x[] = { 1, 2, 3, 4, 5 };
  transform(va.begin(), va.end(), x,
    ostream_iterator<int>(cout, " "),
    mem_fun_ref(&Angle::mul));
  cout << endl;
} ///:~

Because the container is holding objects, mem_fun_ref( ) must be used with the pointer-to- 
member function. This version of transform( ) takes the start and end point of the first range 
(where the objects live), the starting point of the second range which holds the arguments to the 
member function, the destination iterator which in this case is standard output, and the 
function object to call for each object; this function object is created with mem_fun_ref( ) 
and the desired pointer to member. Notice the transform( ) and for_each( ) template 
functions are incomplete; transform( ) requires that the function it calls return a value and 
there is no for_each( ) that passes two arguments to the function it calls. Thus, you cannot 
call a member function that returns void and takes an argument using transform( ) or 
for_each( ). 
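
A minimal sketch of the two-range form of transform( ) applied to a member function follows. The Angle class here is a hypothetical stand-in with a single int mul(int) member, the helper mulEach( ) is likewise invented, and std::mem_fn replaces mem_fun_ref( ) so the code also compiles under C++17 and later.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the Angle class used in the text:
class Angle {
  int degrees;
public:
  explicit Angle(int deg) : degrees(deg) {}
  int mul(int times) { return degrees *= times; }
};

// Calls va[i].mul(args[i]) for each i and collects the return values.
// The two ranges must be the same length.
inline std::vector<int> mulEach(std::vector<Angle>& va,
                                const std::vector<int>& args) {
  assert(va.size() == args.size());
  std::vector<int> out;
  std::transform(va.begin(), va.end(), args.begin(),
                 std::back_inserter(out), std::mem_fn(&Angle::mul));
  return out;
}
```

Because transform( ) must store something in its destination, mul( ) has to return a value; a member function returning void would not fit here, which is the limitation described above.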

Any member function works, including those in the Standard libraries. For example, suppose 
you'd like to read a file and search for blank lines; you can use the string::empty( ) member 
function like this: 

//: C05:FindBlanks.cpp
// Demonstrate mem_fun_ref() with string::empty()
#include "../require.h"
#include <algorithm>
#include <list>
#include <string>
#include <fstream>
#include <functional>
using namespace std;

typedef list<string>::iterator LSI;
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LSI blank(LSI begin, LSI end) {
  return find_if(begin, end,
    mem_fun_ref(&string::empty));
}


int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  list<string> ls;
  string s;
  while(getline(in, s))
    ls.push_back(s);
  LSI lsi = blank(ls.begin(), ls.end());
  while(lsi != ls.end()) {
    *lsi = "A BLANK LINE";
    lsi = blank(lsi, ls.end());
  }
  string f(argv[1]);

  ofstream out(f.c_str());
  copy(ls.begin(), ls.end(),
    ostream_iterator<string>(out, "\n"));
} ///:~

The blank( ) function uses find_if( ) to locate the first blank line in the given range using 
mem_fun_ref( ) with string::empty( ). After the file is opened and read into the list, blank( ) 
is called repeatedly to find every blank line in the file. Notice that subsequent calls to 
blank( ) use the current version of the iterator so it moves forward to the next one. Each time 
a blank line is found, it is replaced with the characters "A BLANK LINE." All you have to do 
to accomplish this is dereference the iterator, and you select the current string. 
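
The search-and-replace loop can be separated from the file handling, which makes it easy to try on its own. In this sketch the function name markBlanks is hypothetical, and a lambda calls string::empty( ) in place of mem_fun_ref( ) so that it also works with compilers where that adapter has been removed.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

// Replaces every empty string in ls with marker and returns how many
// lines were replaced. Each search resumes from the previous hit.
inline int markBlanks(std::list<std::string>& ls,
                      const std::string& marker) {
  assert(!marker.empty()); // An empty marker would loop forever
  int replaced = 0;
  std::list<std::string>::iterator it = ls.begin();
  while((it = std::find_if(it, ls.end(),
          [](const std::string& s) { return s.empty(); }))
        != ls.end()) {
    *it = marker; // Dereference to select the current string
    ++replaced;
  }
  return replaced;
}
```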



SGI extensions 



The SGI STL (mentioned at the end of the previous chapter) also includes additional function 
object templates, which allow you to write expressions that create even more complicated 
function objects. Consider a more involved program which converts strings of digits into 
floating point numbers, like PtrFun2.cpp but more general. First, here's a generator that 
creates strings of integers that represent floating-point values (including an embedded decimal 
point): 

//: C05:NumStringGen.h
// A random number generator that produces
// strings representing floating-point numbers
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#ifndef NUMSTRINGGEN_H 
#define NUMSTRINGGEN_H
#include <string>
#include <cstdlib>
#include <ctime>

class NumStringGen {
  const int sz; // Number of digits to make
public:
  NumStringGen(int ssz = 5) : sz(ssz) {
    std::srand(std::time(0));
  }
  std::string operator()() {
    static char n[] = "0123456789";
    const int nsz = 10;
    std::string r(sz, ' ');
    for(int i = 0; i < sz; i++)
      if(i == sz/2)
        r[i] = '.'; // Insert a decimal point
      else
        r[i] = n[std::rand() % nsz];
    return r;
  }
};
#endif // NUMSTRINGGEN_H ///:~

You tell it how big the strings should be when you create the NumStringGen object. The 
random number generator is used to select digits, and a decimal point is placed in the middle. 

The following program (which works with the Standard C++ STL without the SGI 
extensions) uses NumStringGen to fill a vector<string>. However, to use the Standard C 
library function atof( ) to convert the strings to floating-point numbers, the string objects 
must first be turned into char pointers, since there is no automatic type conversion from 
string to char*. The transform( ) algorithm can be used with mem_fun_ref( ) and 
string::c_str( ) to convert all the strings to char*, and then these can be transformed using 
atof: 

//: C05:MemFun3.cpp
// Using mem_fun()
#include "NumStringGen.h"
#include <algorithm>
#include <vector>
#include <string>
#include <iostream>
#include <functional>
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int main() {
  vector<string> vs(sz);
  // Fill it with random number strings:
  generate(vs.begin(), vs.end(), NumStringGen());
  copy(vs.begin(), vs.end(),
    ostream_iterator<string>(cout, "\t"));
  cout << endl;
  const char* vcp[sz];
  transform(vs.begin(), vs.end(), vcp,
    mem_fun_ref(&string::c_str));
  vector<double> vd;
  transform(vcp, vcp + sz, back_inserter(vd),
    std::atof);
  copy(vd.begin(), vd.end(),
    ostream_iterator<double>(cout, "\t"));
  cout << endl;
} ///:~

The SGI extensions to the STL contain a number of additional function object templates that 
accomplish more detailed activities than the Standard C++ function object templates, 
including identity (returns its argument unchanged), project1st and project2nd (to take two 
arguments and return the first or second one, respectively), select1st and select2nd (to take a 
pair object and return the first or second element, respectively), and the "compose" function 
templates. 

If you're using the SGI extensions, you can make the above program denser using one of the 
two "compose" function templates. The first, compose1(f1, f2), takes the two function objects 
f1 and f2 as its arguments. It produces a function object that takes a single argument, passes it 
to f2, then takes the result of the call to f2 and passes it to f1. The result of f1 is returned. By 
using compose1( ), the process of converting the string objects to char*, then converting the 
char* to a floating-point number can be combined into a single operation, like this: 

//: C05:MemFun4.cpp
// Using the SGI STL compose1 function
#include "NumStringGen.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
using namespace std;
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int main() {
  vector<string> vs(sz);
  // Fill it with random number strings:
  generate(vs.begin(), vs.end(), NumStringGen());
  copy(vs.begin(), vs.end(),
    ostream_iterator<string>(cout, "\t"));
  cout << endl;
  vector<double> vd;
  transform(vs.begin(), vs.end(), back_inserter(vd),
    compose1(ptr_fun(atof),
      mem_fun_ref(&string::c_str)));
  copy(vd.begin(), vd.end(),
    ostream_iterator<double>(cout, "\t"));
  cout << endl;
} ///:~

You can see there's only a single call to transform( ) now, and no intermediate holder for the 
char pointers. 
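
If you don't have the SGI extensions, the idea behind compose1( ) is easy to sketch yourself. The following is not the SGI implementation; composeUnary and toDoubles are hypothetical names, and the sketch assumes a C++14 (or later) compiler for the deduced return type and the generic lambda.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical equivalent of SGI's compose1(f1, f2): the returned
// object calls f2 first and passes the result to f1.
template<typename F1, typename F2>
auto composeUnary(F1 f1, F2 f2) {
  return [f1, f2](const auto& x) { return f1(f2(x)); };
}

// string -> const char* -> double in a single transform() call:
inline std::vector<double>
toDoubles(const std::vector<std::string>& vs) {
  std::vector<double> vd;
  std::transform(vs.begin(), vs.end(), std::back_inserter(vd),
    composeUnary(
      [](const char* p) { return std::atof(p); },
      [](const std::string& s) { return s.c_str(); }));
  assert(vd.size() == vs.size());
  return vd;
}
```

As with compose1( ), the order of the arguments is the order of mathematical composition: the second function runs first.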

The second "compose" function is compose2( ), which takes three function objects as its 
arguments. The first function object is binary (it takes two arguments), and its arguments are 
the results of the second and third function objects, respectively. The function object that 
results from compose2( ) expects one argument, and it feeds that argument to the second and 
third function objects. Here is an example: 

//: C05:Compose2.cpp
// Using the SGI STL compose2() function
#include "copy_if.h"
#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>
#include <cstdlib>
#include <ctime>
using namespace std;

int main() {
  srand(time(0));
  vector<int> v(100);
  generate(v.begin(), v.end(), rand);
  transform(v.begin(), v.end(), v.begin(),
    bind2nd(divides<int>(), RAND_MAX/100));
  vector<int> r;
  copy_if(v.begin(), v.end(), back_inserter(r),
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    compose2(logical_and<bool>(),
      bind2nd(greater_equal<int>(), 30),
      bind2nd(less_equal<int>(), 40)));
  sort(r.begin(), r.end());
  copy(r.begin(), r.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
} ///:~

The vector<int> v is first filled with random numbers. To cut these down to size, the 
transform( ) algorithm is used to divide each value by RAND_MAX/100, which will force 
the values to be between 0 and 100 (making them more readable). The copy_if( ) algorithm 
defined later in this chapter is then used, along with a composed function object, to copy all 
the elements that are greater than or equal to 30 and less than or equal to 40 into the 
destination vector<int> r. Just to show how easy it is, r is sorted, and then displayed. 

The arguments of compose2( ) say, in effect: 

(x >= 30) && (x <= 40) 

You could also take the function object that comes from a compose1( ) or compose2( ) call 
and pass it into another "compose" expression ... but this could rapidly get very difficult to 
decipher. 
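
The mechanism of compose2( ) can be sketched without the SGI headers. This is an illustration rather than the SGI code: composeBinary and inRange are hypothetical names, the sketch assumes C++14 or later, and it uses std::copy_if, which became part of the Standard library in C++11.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical equivalent of SGI's compose2(op, f, g): the returned
// object computes op(f(x), g(x)) for its single argument x.
template<typename Op, typename F, typename G>
auto composeBinary(Op op, F f, G g) {
  return [op, f, g](const auto& x) { return op(f(x), g(x)); };
}

// Copies the values x for which (x >= lo) && (x <= hi):
inline std::vector<int> inRange(const std::vector<int>& v,
                                int lo, int hi) {
  std::vector<int> r;
  std::copy_if(v.begin(), v.end(), std::back_inserter(r),
    composeBinary(std::logical_and<bool>(),
      [lo](int x) { return x >= lo; },
      [hi](int x) { return x <= hi; }));
  return r;
}
```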

Instead of all this composing and transforming, you can write your own function objects 
(without using the SGI extensions) as follows: 

//: C05:NoCompose.cpp
// Writing out the function objects explicitly
#include "copy_if.h"
#include <algorithm>
#include <vector>
#include <string>
#include <iostream>
#include <functional>
#include <cstdlib>
#include <ctime>



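using namespace std;

class Rgen {
  const int max;
public:
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  Rgen(int mx = 100) : max(RAND_MAX/mx) {
    srand(time(0));
  }
  int operator()() { return rand() / max; }
};

class BoundTest {
  int bottom, top;
public:
  BoundTest(int b, int t) : bottom(b), top(t) {}
  bool operator()(int arg) {
    return (arg >= bottom) && (arg <= top);
  }
};

int main() {
  vector<int> v(100);
  generate(v.begin(), v.end(), Rgen());
  vector<int> r;
  copy_if(v.begin(), v.end(), back_inserter(r),
    BoundTest(30, 40));
  sort(r.begin(), r.end());
  copy(r.begin(), r.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
} ///:~
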

There are a few more lines of code, but you can't deny that it's much clearer and easier to 
understand, and therefore to maintain. 

We can thus observe two drawbacks to the SGI extensions to the STL. The first is simply that 
it's an extension; yes, you can download and use them for free so the barriers to entry are low, 
but your company may be conservative and decide that if it's not in Standard C++, they don't 
want to use it. The second drawback is complexity. Once you get familiar and comfortable 
with the idea of composing complicated functions from simple ones you can visually parse 
complicated expressions and figure out what they mean. However, my guess is that most 
people will find anything more than what you can do with the Standard, non-extended STL 
function object notation to be overwhelming. At some point on the complexity curve you have 
to bite the bullet and write a regular class to produce your function object, and that point 
might as well be the point where you can't use the Standard C++ STL. A stand-alone class for 
a function object is going to be much more readable and maintainable than a complicated 
function-composition expression (although my sense of adventure does lure me into wanting 
to experiment more with the SGI extensions). 



As a final note, you can't compose generators; you can only create function objects whose 
operator( ) requires one or two arguments. 

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A catalog of STL algorithms 

This section provides a quick reference for when you're searching for the appropriate 
algorithm. I leave the full exploration of all the STL algorithms to other references (see the 
end of this chapter, and Appendix XX), along with the more intimate details of complexity, 
performance, etc. My goal here is for you to become rapidly comfortable and facile with the 
algorithms, and I will assume you will look into the more specialized references if you need 
more depth of detail. 

Although you will often see the algorithms described using their full template declaration 
syntax, I am not doing that here because you already know they are templates, and it's quite 
easy to see what the template arguments are from the function declarations. The type names 
for the arguments provide descriptions for the types of iterators required. I think you'll find 
this form is easier to read, while you can quickly find the full declaration in the template 
header file if for some reason you feel the need. 

The names of the iterator classes describe the iterator type they must conform to. The iterator 
types were described in the previous chapter, but here is a summary: 

InputIterator. You (or rather, the STL algorithm and any algorithms you write that 
use InputIterators) can increment this with operator++ and dereference it with 
operator* to read the value (and only read the value), but you can only read each 
value once. InputIterators can be tested with operator== and operator!=. That's 
all. Because an InputIterator is so limited, it can be used with istreams (via 
istream_iterator). 

OutputIterator. This can be incremented with operator++, and dereferenced with 
operator* to write the value (and only write the value), but you can only 
dereference/write each value once. OutputIterators cannot be tested with 
operator== and operator!=, however, because you assume that you can just keep 
sending elements to the destination and that you don't have to see if the destination's 
end marker has been reached. That is, the container that an OutputIterator 
references can take an infinite number of objects, so no end-checking is necessary. 
This requirement is important so that an OutputIterator can be used with ostreams 
(via ostream_iterator), but you'll also commonly use the "insert" iterators 
insert_iterator, front_insert_iterator and back_insert_iterator (generated by the 
helper templates inserter( ), front_inserter( ) and back_inserter( )). 

With both InputIterator and OutputIterator, you cannot have multiple iterators 
pointing to different parts of the same range. Just think in terms of iterators to 
support istreams and ostreams, and InputIterator and OutputIterator will make 
perfect sense. Also note that InputIterator and OutputIterator put the weakest 
restrictions on the types of iterators they will accept, which means that you can use 
any "more sophisticated" type of iterator when you see InputIterator or 
OutputIterator used as STL algorithm template arguments. 
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ForwardIterator. InputIterator and OutputIterator are the most restricted, which 
means they'll work with the largest number of actual iterators. However, there are 
some operations for which they are too restricted; you can only read from an 
InputIterator and write to an OutputIterator, so you can't use them to read and 
modify a range, for example, and you can't have more than one active iterator on a 
particular range, or dereference such an iterator more than once. With a 
ForwardIterator these restrictions are relaxed; you can still only move forward 
using operator++, but you can both write and read and you can write/read multiple 
times in each location. A ForwardIterator is much more like a regular pointer, 
whereas InputIterator and OutputIterator are a bit strange by comparison. 

BidirectionalIterator. Effectively, this is a ForwardIterator that can also go 
backward. That is, a BidirectionalIterator supports all the operations that a 
ForwardIterator does, but in addition it has an operator--. 

RandomAccessIterator. An iterator that is random access supports all the same 
operations that a regular pointer does: you can add and subtract integral values to 
move it forward and backward by jumps (rather than just one element at a time), you 
can subscript it with operator[ ], you can subtract one iterator from another, and 
iterators can be compared to see which is greater using operator<, operator>, etc. If 
you're implementing a sorting routine or something similar, random access iterators 
are necessary to be able to create an efficient algorithm. 

The names used for the template parameter types consist of the above iterator types 
(sometimes with a '1' or '2' appended to distinguish different template arguments), and may 
also include other arguments, often function objects. 
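
One way to check which of these categories a particular iterator belongs to is to ask the compiler. The sketch below relies on the Standard iterator_traits template and the Standard category tag types; the helper name hasCategory is hypothetical, and the code assumes at least C++11 (for static_assert, constexpr and forward_list).

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if iterator type It reports category Tag.
template<typename It, typename Tag>
constexpr bool hasCategory() {
  return std::is_same<
    typename std::iterator_traits<It>::iterator_category,
    Tag>::value;
}

// Stream and insert iterators sit at the restricted end of the scale:
static_assert(hasCategory<std::istream_iterator<int>,
  std::input_iterator_tag>(), "istream_iterator: input");
static_assert(hasCategory<std::ostream_iterator<int>,
  std::output_iterator_tag>(), "ostream_iterator: output");
static_assert(hasCategory<std::back_insert_iterator<std::vector<int> >,
  std::output_iterator_tag>(), "back_insert_iterator: output");
// Containers hand out progressively more capable iterators:
static_assert(hasCategory<std::forward_list<int>::iterator,
  std::forward_iterator_tag>(), "forward_list: forward");
static_assert(hasCategory<std::list<int>::iterator,
  std::bidirectional_iterator_tag>(), "list: bidirectional");
static_assert(hasCategory<std::vector<int>::iterator,
  std::random_access_iterator_tag>(), "vector: random access");
static_assert(hasCategory<int*,
  std::random_access_iterator_tag>(), "pointer: random access");
```

If any of these claims were wrong the file would simply fail to compile, which makes this a cheap way to confirm your understanding of a container before choosing an algorithm for it.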

When describing the group of elements that an operation is performed on, mathematical 
"range" notation will often be used. In this, the square bracket means "includes the end point" 
while the parenthesis means "does not include the end point." When using iterators, a range is 
determined by the iterator pointing to the initial element, and the "past-the-end" iterator, 
pointing past the last element. Since the past-the-end element is never used, the range 
determined by a pair of iterators can thus be expressed as [first, last), where first is the 
iterator pointing to the initial element and last is the past-the-end iterator. 

Most books and discussions of the STL algorithms arrange them according to side effects: 
non-mutating algorithms don't change the elements in the range, mutating algorithms do 
change the elements, etc. These descriptions are based more on the underlying behavior or 
implementation of the algorithm - that is, the designer's perspective. In practice, I don't find 
this a very useful categorization so I shall instead organize them according to the problem you 
want to solve: are you searching for an element or set of elements, performing an operation on 
each element, counting elements, replacing elements, etc. This should help you find the one 
you want more easily. 

Note that all the algorithms are in the namespace std. If you do not see a different header 
such as <utility> or <numeric> above the function declarations, that means it appears in 
<algorithm>. 
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Support tools for example creation 

It's useful to create some basic tools with which to test the algorithms. 

Displaying a range is something that will be done constantly, so here is a templatized function 
that allows you to print any sequence, regardless of the type that's in that sequence: 

//: C05:PrintSequence.h
// Prints the contents of any sequence
#ifndef PRINTSEQUENCE_H
#define PRINTSEQUENCE_H
#include <algorithm>
#include <iostream>
#include <iterator>

template<typename InputIter>
void print(InputIter first, InputIter last,
  const char* nm = "", const char* sep = "\n",
  std::ostream& os = std::cout) {
  if(*nm != '\0') // Only if you provide a string
    os << nm << ": " << sep; // is this printed
  while(first != last)
    os << *first++ << sep;
  os << std::endl;
}

// Use template-templates to allow type deduction
// of the typename T:
template<typename T, template<typename> class C>
void print(C<T>& c, const char* nm = "",
  const char* sep = "\n",
  std::ostream& os = std::cout) {
  if(*nm != '\0') // Only if you provide a string
    os << nm << ": " << sep; // is this printed
  std::copy(c.begin(), c.end(),
    std::ostream_iterator<T>(os, " "));
  os << std::endl;
}
#endif // PRINTSEQUENCE_H ///:~

There are two forms here, one that requires you to give an explicit range (this allows you to 
print an array or a sub-sequence) and one that prints any of the STL containers, which 
provides notational convenience when printing the entire contents of that container. The 
second form performs template type deduction to determine the type of T so it can be used in 
the copy( ) algorithm. That trick wouldn't work with the first form, so the copy( ) algorithm is 
avoided and the copying is just done by hand (this could have been done with the second form 
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as well, but it's instructive to see a template-template in use). Because of this, you never need 
to specify the type that you're printing when you call either template function. 

The default is to print to cout with newlines as separators, but you can change that. You may 
also provide a message to print at the head of the output. 

Next, it's useful to have some generators (classes with an operator( ) that returns values of 
the appropriate type) which allow a sequence to be rapidly filled with different values. 

//: C05:Generators.h
// Different ways to fill sequences
#ifndef GENERATORS_H
#define GENERATORS_H
#include <set>
#include <cstdlib>
#include <cstring>
#include <ctime>

// A generator that can skip over numbers:
class SkipGen {
  int i;
  int skp;
public:
  SkipGen(int start = 0, int skip = 1)
    : i(start), skp(skip) {}
  int operator()() {
    int r = i;
    i += skp;
    return r;
  }
};

// Generate unique random numbers from 0 to mod:
class URandGen {
  std::set<int> used;
  int modulus;
public:
  URandGen(int mod) : modulus(mod) {
    std::srand(std::time(0));
  }
  int operator()() {
    while(true) {
      int i = (int)std::rand() % modulus;
      if(used.find(i) == used.end()) {
        used.insert(i);
        return i;
      }
    }
  }
};

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// Produces random characters: 
class CharGen {
  static const char* source;
  static const int len;
public:
  CharGen() { std::srand(std::time(0)); }
  char operator()() {
    return source[std::rand() % len];
  }
};

// Statics created here for convenience, but
// will cause problems if multiply included:
const char* CharGen::source = "ABCDEFGHIJK"
  "LMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const int CharGen::len = std::strlen(source);
#endif // GENERATORS_H ///:~

To create some interesting values, the SkipGen generator skips by the value skp each time its 
operator( ) is called. You can initialize both the start value and the skip value in the 
constructor. 

URandGen ('U' for "unique") is a generator for random ints between 0 and mod, with the 
additional constraint that each value can only be produced once (thus you must be careful not 
to use up all the values). This is easily accomplished with a set. 

CharGen generates chars and can be used to fill up a string (when treating a string as a 
sequence container). You'll note that the one member function that any generator implements 
is operator( ) (with no arguments). This is what is called by the "generate" functions. 

The use of the generators and the print( ) functions is shown in the following section. 

Finally, a number of the STL algorithms that move elements of a sequence around distinguish 
between "stable" and "unstable" reordering of a sequence. This refers to preserving the 
original order of the elements for those elements that are equivalent but not identical. For 
example, consider a sequence { c(1), b(1), c(2), a(1), b(2), a(2) }. These elements are tested 
for equivalence based on their letters, but their numbers indicate how they first appeared in 
the sequence. If you sort (for example) this sequence using an unstable sort, there's no 
guarantee of any particular order among equivalent letters, so you could end up with { a(2), 
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a(1), b(1), b(2), c(2), c(1) }. However, if you used a stable sort, it guarantees you will get { 
a(1), a(2), b(1), b(2), c(1), c(2) }. 
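
The stable case is easy to try for yourself. In this sketch each element is a pair holding a letter and the order in which that letter appeared; the names Tagged, byLetter and stableSorted are hypothetical, and the comparison deliberately looks at the letter only, so elements with the same letter are equivalent but not identical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<char, int> Tagged; // (letter, order of appearance)

// Compares letters only, so c(1) and c(2) are equivalent:
inline bool byLetter(const Tagged& l, const Tagged& r) {
  return l.first < r.first;
}

// Sorts a copy with stable_sort() and renders the result as text:
inline std::string stableSorted(std::vector<Tagged> v) {
  std::stable_sort(v.begin(), v.end(), byLetter);
  std::string out;
  for(std::vector<Tagged>::size_type i = 0; i < v.size(); ++i) {
    assert(v[i].second >= 0 && v[i].second <= 9); // Single digit
    out += v[i].first;
    out += '(';
    out += static_cast<char>('0' + v[i].second);
    out += ')';
    if(i + 1 < v.size()) out += ' ';
  }
  return out;
}
```

Replacing stable_sort( ) with sort( ) still yields the letters in order, but then the numbers within each letter may come out in any order.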

To demonstrate the stability versus instability of algorithms that reorder a sequence, we need 
some way to keep track of how the elements originally appeared. The following is a kind of 
string object that keeps track of the order in which that particular object originally appeared, 
using a static map that maps NStrings to Counters. Each NString then contains an 
occurrence field that indicates the order in which this NString was discovered: 



//: C05:NString.h
// A "numbered string" that indicates which
// occurrence this is of a particular word
#ifndef NSTRING_H
#define NSTRING_H
#include <string>
#include <map>
#include <iostream>
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} 

  // Need this for sorting. Notice it only
  // compares strings, not occurrences:
  friend bool
  operator<(const NString& l, const NString& r) {
    return l.s < r.s;
  }
  // For sorting with greater<NString>:
  friend bool
  operator>(const NString& l, const NString& r) {
    return l.s > r.s;
  }
  // To get at the string directly:
  operator const std::string&() const { return s; }
};

// Allocate static member object. Done here for
// brevity, but should actually be done in a
// separate cpp file:
NString::csmap NString::occurMap;
#endif // NSTRING_H ///:~

In the constructors (one that takes a string, one that takes a char*), the simple-looking 
initialization occurrence(occurMap[s]++) performs all the work of maintaining and 
assigning the occurrence counts (see the demonstration of the map class in the previous 
chapter for more details). 

To do an ordinary ascending sort, the only operator that's necessary is 
NString::operator<( ); however, to sort in reverse order the operator>( ) is also provided so 
that the greater template can be used. 

As this is just a demonstration class I am getting away with the convenience of putting the 
definition of the static member occurMap in the header file, but this will break down if the 
header file is included in more than one place, so you should normally relegate all static 
definitions to cpp files. 



Filling & generating 



These algorithms (introduced in the previous chapter) fill a range with values. The "fill" 
functions insert a single value multiple times into the container, while the "generate" 
functions use a generator (described earlier) to create the values to insert into the container. 

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void fill(ForwardIterator first, ForwardIterator last, const T& value); 
void fill_n(OutputIterator first, Size n, const T& value); 

fill( ) assigns value to every element in the range [first, last). fill_n( ) assigns value to n 
elements starting at first. 

void generate(ForwardIterator first, ForwardIterator last, Generator gen); 
void generate_n(OutputIterator first, Size n, Generator gen); 

generate( ) makes a call to gen( ) for each element in the range [first, last), presumably to 
produce a different value for each element. generate_n( ) calls gen( ) n times and assigns 
each result to n elements starting at first. 
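
A compact sketch of the difference between the two families: fill( ) assigns to elements that already exist, while the "_n" forms are commonly paired with back_inserter( ) to grow an empty container. The StepGen generator and the two helper functions are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical generator: yields start, start+step, start+2*step, ...
class StepGen {
  int next, step;
public:
  StepGen(int start, int stp) : next(start), step(stp) {}
  int operator()() { int r = next; next += step; return r; }
};

// fill() overwrites existing elements, so the vector is pre-sized.
inline std::vector<int> filled(int n, int value) {
  assert(n >= 0);
  std::vector<int> v(n);
  std::fill(v.begin(), v.end(), value);
  return v;
}

// generate_n() with back_inserter() appends to an empty vector.
inline std::vector<int> generated(int n, int start, int step) {
  assert(n >= 0);
  std::vector<int> v;
  std::generate_n(std::back_inserter(v), n, StepGen(start, step));
  return v;
}
```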

Example 

The following example fills and generates into vectors. It also shows the use of print( ): 

//: C05:FillGenerateTest.cpp
// Demonstrates "fill" and "generate"
#include "Generators.h"
#include "PrintSequence.h"
#include <vector>
#include <algorithm>
#include <string>
using namespace std;

int main() {
  vector<string> v1(5);
  fill(v1.begin(), v1.end(), "howdy");
  print(v1, "v1", " ");
  vector<string> v2;
  fill_n(back_inserter(v2), 7, "bye");
  print(v2.begin(), v2.end(), "v2");
  vector<int> v3(10);
  generate(v3.begin(), v3.end(), SkipGen(4,5));
  print(v3, "v3", " ");
  vector<int> v4;
  generate_n(back_inserter(v4), 15, URandGen(30));
  print(v4, "v4", " ");
} ///:~



A vector<string> is created with a pre-defined size. Since storage has already been created 
for all the string objects in the vector, fill( ) can use its assignment operator to assign a copy 
of "howdy" to each space in the vector. To print the result, the second form of print( ) is used 
which simply needs a container (you don't have to give the first and last iterators). Also, the 
default newline separator is replaced with a space. 






The second vector<string> v2 is not given an initial size so back_inserter must be used to 
force new elements in instead of trying to assign to existing locations. Just as an example, the 
other print( ) is used which requires a range. 

The generate( ) and generate_n( ) functions have the same form as the "fill" functions except 
that they use a generator instead of a constant value; here, both generators are demonstrated. 



Counting 



All containers have a member function size( ) that will tell you how many elements they hold. The 
following two algorithms count objects only if they satisfy certain criteria. 

IntegralValue count(InputIterator first, InputIterator last, 
const EqualityComparable& value); 

Produces the number of elements in [first, last) that are equivalent to value (when tested 
using operator==). 

IntegralValue count_if(InputIterator first, InputIterator last, Predicate pred); 

Produces the number of elements in [first, last) which each cause pred to return true. 



Example 



Here, a vector<char> v is filled with random characters (including some duplicates). A
set<char> is initialized from v, so it holds only one of each letter represented in v. This set is
used to count all the instances of all the different characters, which are then displayed:

//: C05:Counting.cpp
// The counting algorithms
#include "PrintSequence.h"
#include "Generators.h"
#include <vector>
#include <set>
#include <iostream>
#include <functional>
#include <algorithm>
using namespace std;

int main() {
  vector<char> v;
  generate_n(back_inserter(v), 50, CharGen());
  print(v, "v", "");
  // Create a set of the characters in v:
  set<char> cs(v.begin(), v.end());
  set<char>::iterator it = cs.begin();
  while(it != cs.end()) {
    int n = count(v.begin(), v.end(), *it);
    cout << *it << ": " << n << ", ";
    it++;
  }
  // Note: greater<char> does not count 'a' itself
  int lc = count_if(v.begin(), v.end(),
    bind2nd(greater<char>(), 'a'));
  cout << "\nLowercase letters: " << lc << endl;
  sort(v.begin(), v.end());
  print(v, "sorted", "");
} ///:~

The count_if( ) algorithm is demonstrated by counting all the lowercase letters; the predicate
is created using the bind2nd( ) and greater function object templates.



Manipulating sequences 



OutputIterator copy(InputIterator first, InputIterator last, OutputIterator destination);

Using assignment, copies from [first, last) to destination, incrementing destination after
each assignment. Works with almost any type of source range and almost any kind of
destination. Because assignment is used, you cannot directly insert elements into an empty
container or at the end of a container, but instead you must wrap the destination iterator in an
insert_iterator (typically by using back_inserter( ), or inserter( ) in the case of an
associative container).

The copy algorithm is used in many examples in this book. 

BidirectionalIterator2 copy_backward(BidirectionalIterator1 first,
  BidirectionalIterator1 last, BidirectionalIterator2 destinationEnd);

Like copy( ), but performs the actual copying of the elements in reverse order. That is, the
resulting sequence is the same, it's just that the copy happens in a different way. The source
range [first, last) is copied to the destination, but the first destination element is
destinationEnd - 1. This iterator is then decremented after each assignment. The space in the
destination range must already exist (to allow assignment), and destinationEnd must not lie
within the source range. (The destination may begin inside the source range; copying
backward is what makes that overlapping case safe.)

void reverse(BidirectionalIterator first, BidirectionalIterator last);
OutputIterator reverse_copy(BidirectionalIterator first, BidirectionalIterator last,
  OutputIterator destination);

Both forms of this function reverse the range [first, last). reverse( ) reverses the range in
place, while reverse_copy( ) leaves the original range alone and copies the reversed elements
into destination, returning the past-the-end iterator of the resulting range.

ForwardIterator2 swap_ranges(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2);

Exchanges the contents of two ranges of equal size, by moving from the beginning to the end
of each range and swapping each pair of elements.

void rotate(ForwardIterator first, ForwardIterator middle, ForwardIterator last);
OutputIterator rotate_copy(ForwardIterator first, ForwardIterator middle,
  ForwardIterator last, OutputIterator destination);

Swaps the two ranges [first, middle) and [middle, last). With rotate( ), the swap is
performed in place, and with rotate_copy( ) the original range is untouched and the rotated
version is copied into destination, returning the past-the-end iterator of the resulting range.
Note that while swap_ranges( ) requires that the two ranges be exactly the same size, the
"rotate" functions do not.

bool next_permutation(BidirectionalIterator first, BidirectionalIterator last);
bool next_permutation(BidirectionalIterator first, BidirectionalIterator last,
  StrictWeakOrdering binary_pred);
bool prev_permutation(BidirectionalIterator first, BidirectionalIterator last);
bool prev_permutation(BidirectionalIterator first, BidirectionalIterator last,
  StrictWeakOrdering binary_pred);

A permutation is one unique ordering of a set of elements. If you have n unique elements,
then there are n! (n factorial) distinct possible orderings of those elements. All these
orderings can be conceptually sorted into a sequence using a lexicographical ordering, and
thus produce a concept of a "next" and "previous" permutation. Therefore, whatever the
current ordering of elements in the range, there is a distinct "next" and "previous"
permutation in the sequence of permutations.

The next_permutation( ) and prev_permutation( ) functions re-arrange the elements into
their next or previous permutation, and if successful return true. If there are no more "next"
permutations, the elements are put back into ascending sorted order (the first permutation)
and next_permutation( ) returns false. If there are no more "previous" permutations, the
elements are put into descending sorted order (the last permutation) and
prev_permutation( ) returns false.

The versions of the functions which have a StrictWeakOrdering argument perform the
comparisons using binary_pred instead of operator<.

void random_shuffle(RandomAccessIterator first, RandomAccessIterator last);
void random_shuffle(RandomAccessIterator first, RandomAccessIterator last,
  RandomNumberGenerator& rand);

This function randomly rearranges the elements in the range. It yields uniformly distributed
results. The first form uses an internal random number generator and the second uses a
user-supplied random-number generator.

BidirectionalIterator partition(BidirectionalIterator first, BidirectionalIterator last,
  Predicate pred);
BidirectionalIterator stable_partition(BidirectionalIterator first,
  BidirectionalIterator last, Predicate pred);

The "partition" functions use pred to organize the elements in the range [first, last) so they
are before or after the partition (a point in the range). The partition point is given by the
returned iterator. If pred(*i) is true (where i is the iterator pointing to a particular element),
then that element will be placed before the partition point, otherwise it will be placed after the
partition point.

With partition( ), the order of the elements after the function call is not specified, but with
stable_partition( ) the relative order of the elements before and after the partition point will be
the same as before the partitioning process.

Example 

//: C05:Manipulations.cpp
// Shows basic manipulations
#include "PrintSequence.h"
#include "NString.h"
#include "Generators.h"
#include <vector>
#include <string>
#include <cstring>
#include <iostream>
#include <functional>
#include <algorithm>
using namespace std;

int main() {
  vector<int> v1(10);
  // Simple counting:
  generate(v1.begin(), v1.end(), SkipGen());
  print(v1, "v1", " ");
  vector<int> v2(v1.size());
  copy_backward(v1.begin(), v1.end(), v2.end());
  print(v2, "copy_backward", " ");
  reverse_copy(v1.begin(), v1.end(), v2.begin());
  print(v2, "reverse_copy", " ");
  reverse(v1.begin(), v1.end());
  print(v1, "reverse", " ");
  int half = v1.size() / 2;
  // Ranges must be exactly the same size:
  swap_ranges(v1.begin(), v1.begin() + half,
    v1.begin() + half);
  print(v1, "swap_ranges", " ");
  // Start with fresh sequence:
  generate(v1.begin(), v1.end(), SkipGen());
  print(v1, "v1", " ");
  int third = v1.size() / 3;
  for(int i = 0; i < 10; i++) {
    rotate(v1.begin(), v1.begin() + third,
      v1.end());
    print(v1, "rotate", " ");
  }
  cout << "Second rotate example:" << endl;
  char c[] = "aabbccddeeffgghhiijj";
  const int csz = strlen(c);
  for(int i = 0; i < 10; i++) {
    rotate(c, c + 2, c + csz);
    print(c, c + csz, "", "");
  }
  cout << "All n! permutations of abcd:" << endl;
  int nf = 4 * 3 * 2 * 1;
  char p[] = "abcd";
  for(int i = 0; i < nf; i++) {
    next_permutation(p, p + 4);
    print(p, p + 4, "", "");
  }
  cout << "Using prev_permutation:" << endl;
  for(int i = 0; i < nf; i++) {
    prev_permutation(p, p + 4);
    print(p, p + 4, "", "");
  }
  cout << "random_shuffling a word:" << endl;
  string s("hello");
  cout << s << endl;
  for(int i = 0; i < 5; i++) {
    random_shuffle(s.begin(), s.end());
    cout << s << endl;
  }
  NString sa[] = { "a", "b", "c", "d", "a", "b",
    "c", "d", "a", "b", "c", "d", "a", "b", "c" };
  const int sasz = sizeof sa / sizeof *sa;
  vector<NString> ns(sa, sa + sasz);
  print(ns, "ns", " ");
  vector<NString>::iterator it =
    partition(ns.begin(), ns.end(),
      bind2nd(greater<NString>(), "b"));
  cout << "Partition point: " << *it << endl;
  print(ns, "", " ");
  // Reload vector:
  copy(sa, sa + sasz, ns.begin());
  it = stable_partition(ns.begin(), ns.end(),
    bind2nd(greater<NString>(), "b"));
  cout << "Stable partition" << endl;
  cout << "Partition point: " << *it << endl;
  print(ns, "", " ");
} ///:~



The best way to see the results of the above program is to run it (you'll probably want to 
redirect the output to a file). 

The vector<int> v1 is initially loaded with a simple ascending sequence and printed. You'll
see that the effect of copy_backward( ) (which copies into v2, which is the same size as v1)
is the same as an ordinary copy. Again, copy_backward( ) does the same thing as copy( ), it
just performs the operations in backward order.

reverse_copy( ), however, actually does create a reversed copy, while reverse( ) performs
the reversal in place. Next, swap_ranges( ) swaps the upper half of the reversed sequence
with the lower half. Of course, the ranges could be smaller subsets of the entire vector, as long
as they are of equivalent size.

After re-creating the ascending sequence, rotate( ) is demonstrated by rotating one third of v1
multiple times. A second rotate( ) example uses characters and just rotates two characters at a
time. This also demonstrates the flexibility of both the STL algorithms and the print( )
template, since they can both be used with arrays of char as easily as with anything else.

To demonstrate next_permutation( ) and prev_permutation( ), a set of four characters
"abcd" is permuted through all n! (n factorial) possible combinations. You'll see from the
output that the permutations move through a strictly-defined order (that is, permuting is a
deterministic process).

A quick-and-dirty demonstration of random_shuffle( ) is to apply it to a string and see what
words result. Because a string object has begin( ) and end( ) member functions that return the
appropriate iterators, it too may be easily used with many of the STL algorithms. Of course,
an array of char could also have been used.

Finally, the partition( ) and stable_partition( ) are demonstrated, using an array of NString. 
You'll note that the aggregate initialization expression uses char arrays, but NString has a 
char* constructor which is automatically used. 

When partitioning a sequence, you need a predicate which will determine whether the object 
belongs above or below the partition point. This takes a single argument and returns true (the 
object is above the partition point) or false (it isn't). I could have written a separate function 
or function object to do this, but for something simple, like "the object is greater than 'b'", 
why not use the built-in function object templates? The expression is: 

bind2nd(greater<NString>(), "b")

And to understand it, you need to pick it apart from the middle outward. First,

greater<NString>()

is a binary function object which compares its first and second arguments:

return first > second;

and returns a bool. But we don't want a binary predicate, and we want to compare against the
constant value "b." So bind2nd( ) says: create a new function object which only takes one
argument, by taking this greater<NString>( ) function and forcing the second argument to
always be "b." The first argument (the only argument) will be the one from the vector ns.

You'll see from the output that with the unstable partition, the objects are correctly above and 
below the partition point, but in no particular order, whereas with the stable partition their 
original order is maintained. 



Searching & replacing 



InputIterator find(InputIterator first, InputIterator last,
  const EqualityComparable& value);

Searches for value within a range of elements. Returns an iterator in the range [first, last) that
points to the first occurrence of value. If value isn't in the range, then find( ) returns last.
This is a linear search, that is, it starts at the beginning and looks at each sequential element
without making any assumptions about the way the elements are ordered. In contrast, a
binary_search( ) (defined later) works on a sorted sequence and can thus be much faster.

InputIterator find_if(InputIterator first, InputIterator last, Predicate pred);

Just like find( ), find_if( ) performs a linear search through the range. However, instead of
searching for value, find_if( ) looks for an element such that the Predicate pred returns true
when applied to that element. Returns last if no such element can be found.

ForwardIterator adjacent_find(ForwardIterator first, ForwardIterator last);
ForwardIterator adjacent_find(ForwardIterator first, ForwardIterator last,
  BinaryPredicate binary_pred);

Like find( ), performs a linear search through the range, but instead of looking for only one
element it searches for two elements that are right next to each other. The first form of the
function looks for two elements that are equivalent (via operator==). The second form looks
for two adjacent elements that, when passed together to binary_pred, produce a true result.
If two adjacent elements cannot be found, last is returned.

ForwardIterator1 find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1 find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2, BinaryPredicate binary_pred);

Like find( ), performs a linear search through the range. The first form finds the first element
in the first range that is equivalent to any of the elements in the second range. The second
form finds the first element in the first range that produces true when passed to binary_pred
along with any of the elements in the second range. When a BinaryPredicate is used with
two ranges in the algorithms, the element from the first range becomes the first argument to
binary_pred, and the element from the second range becomes the second argument.

ForwardIterator1 search(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1 search(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2, BinaryPredicate binary_pred);

Attempts to find the entire range [first2, last2) within the range [first1, last1). That is, it
checks to see if the second range occurs (in the exact order of the second range) within the
first range, and if so returns an iterator pointing to the place in the first range where the
second range begins. Returns last1 if no subset can be found. The first form performs its test
using operator==, while the second checks to see if each pair of objects being compared
causes binary_pred to return true.

ForwardIterator1 find_end(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1 find_end(ForwardIterator1 first1, ForwardIterator1 last1,
  ForwardIterator2 first2, ForwardIterator2 last2, BinaryPredicate binary_pred);

The forms and arguments are just like search( ) in that it looks for the second range within the
first range, but while search( ) looks for the first occurrence of the second range, find_end( )
looks for the last occurrence of the second range within the first.

ForwardIterator search_n(ForwardIterator first, ForwardIterator last,
  Size count, const T& value);
ForwardIterator search_n(ForwardIterator first, ForwardIterator last,
  Size count, const T& value, BinaryPredicate binary_pred);

Looks for a group of count consecutive values in [first, last) that are all equal to value (in the
first form) or that all cause a return value of true when passed into binary_pred along with
value (in the second form). Returns last if such a group cannot be found.

ForwardIterator min_element(ForwardIterator first, ForwardIterator last);
ForwardIterator min_element(ForwardIterator first, ForwardIterator last,
  BinaryPredicate binary_pred);

Returns an iterator pointing to the first occurrence of the smallest value in the range (there
may be multiple occurrences of the smallest value). Returns last if the range is empty. The
first version performs comparisons with operator< and the value r returned is such that

*e < *r

is false for every element e in the range. The second version compares using binary_pred
and the value r returned is such that binary_pred(*e, *r) is false for every element e in the
range.

ForwardIterator max_element(ForwardIterator first, ForwardIterator last);
ForwardIterator max_element(ForwardIterator first, ForwardIterator last,
  BinaryPredicate binary_pred);

Returns an iterator pointing to the first occurrence of the largest value in the range (there may
be multiple occurrences of the largest value). Returns last if the range is empty. The first
version performs comparisons with operator< and the value r returned is such that

*r < *e

is false for every element e in the range. The second version compares using binary_pred
and the value r returned is such that binary_pred(*r, *e) is false for every element e in the
range.

void replace(ForwardIterator first, ForwardIterator last,
  const T& old_value, const T& new_value);
void replace_if(ForwardIterator first, ForwardIterator last,
  Predicate pred, const T& new_value);
OutputIterator replace_copy(InputIterator first, InputIterator last,
  OutputIterator result, const T& old_value, const T& new_value);
OutputIterator replace_copy_if(InputIterator first, InputIterator last,
  OutputIterator result, Predicate pred, const T& new_value);

Each of the "replace" forms moves through the range [first, last), finding values that match a
criterion and replacing them with new_value. Both replace( ) and replace_copy( ) simply
look for old_value to replace, while replace_if( ) and replace_copy_if( ) look for values that
satisfy the predicate pred. The "copy" versions of the functions do not modify the original
range but instead make a copy with the replacements into result (incrementing result after
each assignment).

Example 

To provide easy viewing of the results, this example will manipulate vectors of int. Again,
not every possible version of each algorithm will be shown (some that should be obvious have 
been omitted). 

//: C05:SearchReplace.cpp
// The STL search and replace algorithms
#include "PrintSequence.h"
#include <vector>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <functional>
using namespace std;

// True if the second argument is one more than the first
struct PlusOne {
  bool operator()(int i, int j) const {
    return j == i + 1;
  }
};

// True if the product of the arguments exceeds a stored value
class MulMoreThan {
  int value;
public:
  MulMoreThan(int val) : value(val) {}
  bool operator()(int v, int m) const {
    return v * m > value;
  }
};

int main() {
  int a[] = { 1, 2, 3, 4, 5, 6, 5, 7, 7, 7,
    8, 8, 8, 8, 11, 11, 11, 11, 11 };
  const int asz = sizeof a / sizeof *a;
  vector<int> v(a, a + asz);
  print(v, "v", " ");
  vector<int>::iterator it =
    find(v.begin(), v.end(), 4);
  cout << "find: " << *it << endl;
  it = find_if(v.begin(), v.end(),
    bind2nd(greater<int>(), 8));
  cout << "find_if: " << *it << endl;
  it = adjacent_find(v.begin(), v.end());
  while(it != v.end()) {
    cout << "adjacent_find: " << *it
      << ", " << *(it + 1) << endl;
    it = adjacent_find(it + 2, v.end());
  }
  it = adjacent_find(v.begin(), v.end(),
    PlusOne());
  while(it != v.end()) {
    cout << "adjacent_find PlusOne: " << *it
      << ", " << *(it + 1) << endl;
    it = adjacent_find(it + 1, v.end(),
      PlusOne());
  }
  int b[] = { 8, 11 };
  const int bsz = sizeof b / sizeof *b;
  print(b, b + bsz, "b", " ");
  it = find_first_of(v.begin(), v.end(),
    b, b + bsz);
  print(it, it + bsz, "find_first_of", " ");
  it = find_first_of(v.begin(), v.end(),
    b, b + bsz, PlusOne());
  print(it, it + bsz, "find_first_of PlusOne", " ");
  it = search(v.begin(), v.end(), b, b + bsz);
  print(it, it + bsz, "search", " ");
  int c[] = { 5, 6, 7 };
  const int csz = sizeof c / sizeof *c;
  print(c, c + csz, "c", " ");
  it = search(v.begin(), v.end(),
    c, c + csz, PlusOne());
  print(it, it + csz, "search PlusOne", " ");
  int d[] = { 11, 11, 11 };
  const int dsz = sizeof d / sizeof *d;
  print(d, d + dsz, "d", " ");
  it = find_end(v.begin(), v.end(), d, d + dsz);
  print(it, v.end(), "find_end", " ");
  int e[] = { 9, 9 };
  print(e, e + 2, "e", " ");
  it = find_end(v.begin(), v.end(),
    e, e + 2, PlusOne());
  print(it, v.end(), "find_end PlusOne", " ");
  it = search_n(v.begin(), v.end(), 3, 7);
  print(it, it + 3, "search_n 3, 7", " ");
  it = search_n(v.begin(), v.end(),
    6, 15, MulMoreThan(100));
  print(it, it + 6,
    "search_n 6, 15, MulMoreThan(100)", " ");
  cout << "min_element: " <<
    *min_element(v.begin(), v.end()) << endl;
  cout << "max_element: " <<
    *max_element(v.begin(), v.end()) << endl;
  vector<int> v2;
  replace_copy(v.begin(), v.end(),
    back_inserter(v2), 8, 47);
  print(v2, "replace_copy 8 -> 47", " ");
  replace_if(v.begin(), v.end(),
    bind2nd(greater_equal<int>(), 7), -1);
  print(v, "replace_if >= 7 -> -1", " ");
} ///:~

The example begins with two predicates: PlusOne which is a binary predicate that returns
true if the second argument is equivalent to one plus the first argument, and MulMoreThan
which returns true if the first argument times the second argument is greater than a value
stored in the object. These binary predicates are used as tests in the example.

In main( ), an array a is created and fed to the constructor for vector<int> v. This vector will
be used as the target for the search and replace activities, and you'll note that there are
duplicate elements - these will be discovered by some of the search/replace routines.

The first test demonstrates find( ), discovering the value 4 in v. The return value is the iterator
pointing to the first instance of 4, or the end of the input range (v.end( )) if the search value is
not found.

find_if( ) uses a predicate to determine if it has discovered the correct element. In the above
example, this predicate is created on the fly using greater<int> (that is, "see if the first int
argument is greater than the second") and bind2nd( ) to fix the second argument to 8. Thus, it
returns true if the value in v is greater than 8.






Since there are a number of cases in v where two identical objects appear next to each other,
the test of adjacent_find( ) is designed to find them all. It starts looking from the beginning
and then drops into a while loop, making sure that the iterator it has not reached the end of the
input sequence (which would mean that no more matches can be found). For each match it
finds, the loop prints out the matches and then performs the next adjacent_find( ), this time
using it + 2 as the first argument (this way, it moves past the two elements that it already
found).

You might look at the while loop and think that you can do it a bit more cleverly, to wit:

  while(it != v.end()) {
    cout << "adjacent_find: " << *it++
      << ", " << *it++ << endl;
    it = adjacent_find(it, v.end());
  }

Of course, this is exactly what I tried at first. However, I did not get the output I expected, on
any compiler. This is because there is no guarantee about when the increments occur in the
above expression. A bit of a disturbing discovery, I know, but the situation is best avoided.



The next test uses adjacent_find( ) with the PlusOne predicate, which discovers all the
places where the next number in the sequence v changes from the previous by one. The same
while approach is used to find all the cases.

find_first_of( ) requires a second range of objects for which to hunt; this is provided in the 
array b. Notice that, because the first range and the second range in find_first_of( ) are 
controlled by separate template arguments, those ranges can refer to two different types of 
containers, as seen here. The second form of find_first_of( ) is also tested, using PlusOne. 

search( ) finds exactly the second range inside the first one, with the elements in the same
order. The second form of search( ) uses a predicate, which is typically just something that
defines equivalence, but it also opens some interesting possibilities - here, the PlusOne
predicate causes the range { 4, 5, 6 } to be found.

The find_end( ) test discovers the last occurrence of the entire sequence { 11, 11, 11 }. To
show that it has in fact found the last occurrence, the rest of v starting from it is printed.

The first search_n( ) test looks for 3 copies of the value 7, which it finds and prints. When
using the second version of search_n( ), the predicate is ordinarily meant to be used to
determine equivalence between two elements, but I've taken some liberties and used a
function object that multiplies the value in the sequence by (in this case) 15 and checks to see
if it's greater than 100. That is, the search_n( ) test above says "find me 6 consecutive values
which, when multiplied by 15, each produce a number greater than 100." Not exactly what
you normally expect to do, but it might give you some ideas the next time you have an odd
searching problem.

min_element( ) and max_element( ) are straightforward; the only thing that's a bit odd is that
it looks like the function is being dereferenced with a '*'. Actually, the returned iterator is
being dereferenced to produce the value for printing.

To test replacements, replace_copy( ) is used first (so it doesn't modify the original vector) to
replace all values of 8 with the value 47. Notice the use of back_inserter( ) with the empty
vector v2. To demonstrate replace_if( ), a function object is created using the standard
template greater_equal along with bind2nd to replace all the values that are greater than or
equal to 7 with the value -1.



Comparing ranges 



These algorithms provide ways to compare two ranges. At first glance, the operations they
perform seem very close to the search( ) function above. However, search( ) tells you where
the second sequence appears within the first, while equal( ) and lexicographical_compare( )
simply tell you how the two sequences compare (whether they are exactly identical, or
whether the first precedes the second). On the other hand, mismatch( ) does tell you where
the two sequences go out of sync, but those sequences must be exactly the same length.

bool equal(InputIterator first1, InputIterator last1, InputIterator first2);
bool equal(InputIterator first1, InputIterator last1, InputIterator first2,
  BinaryPredicate binary_pred);

In both of these functions, the first range is the typical one, [first1, last1). The second range
starts at first2, but there is no "last2" because its length is determined by the length of the first
range. The equal( ) function returns true if both ranges are exactly the same (the same
elements in the same order); in the first case, the operator== is used to perform the
comparison and in the second case binary_pred is used to decide if two elements are the
same.



bool lexicographical_compare(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2);
bool lexicographical_compare(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, BinaryPredicate binary_pred);

These two functions determine if the first range is "lexicographically less" than the second
(they return true if range 1 is less than range 2, and false otherwise). Lexicographical
comparison, or "dictionary" comparison, means that the comparison is done the same way we
establish the order of strings in a dictionary, one element at a time. The first elements
determine the result if these elements are different, but if they're equal the algorithm moves on
to the next elements and looks at those, and so on, until it finds a mismatch. At that point it
looks at the elements, and if the element from range 1 is less than the element from range 2,
then lexicographical_compare( ) returns true, otherwise it returns false. If it gets all the way
through one range or the other (the ranges may be different lengths for this algorithm) without
finding an inequality, the lengths decide: if range 1 ran out first, it precedes range 2 and the
function returns true; otherwise range 1 is not less than range 2 and the function returns false.

If the two ranges are different lengths, a missing element in one range acts as one that
"precedes" an element that exists in the other range. So {'a', 'b'} lexicographically precedes
{'a', 'b', 'a'}.

In the first version of the function, operator< is used to perform the comparisons, and in the 
second version binary_pred is used. 

pair<InputIterator1, InputIterator2> mismatch(InputIterator1 first1,
  InputIterator1 last1, InputIterator2 first2);
pair<InputIterator1, InputIterator2> mismatch(InputIterator1 first1,
  InputIterator1 last1, InputIterator2 first2, BinaryPredicate binary_pred);

As in equal( ), the length of both ranges is exactly the same, so only the first iterator in the
second range is necessary, and the length of the first range is used as the length of the second
range. Whereas equal( ) just tells you whether or not the two ranges are the same,
mismatch( ) tells you where they begin to differ. To accomplish this, you must be told (1) the
element in the first range where the mismatch occurred and (2) the element in the second
range where the mismatch occurred. These two iterators are packaged together into a pair
object and returned. If no mismatch occurs, the return value is last1 combined with the
past-the-end iterator of the second range.

As in equal( ), the first function tests for equality using operator== while the second one
uses binary_pred.



Example 



Because the standard C++ string class is built like a container (it has begin( ) and end( )
member functions which produce objects of type string::iterator), it can be used to
conveniently create ranges of characters to test with the STL comparison algorithms.
However, you should note that string has a fairly complete set of native operations, so you
should look at the string class before using the STL algorithms to perform operations.



//: C05:Comparison.cpp
// The STL range comparison algorithms
#include "PrintSequence.h"
#include <vector>
#include <iostream>
#include <utility>
#include <algorithm>
#include <functional>
#include <string>
using namespace std;

int main() {
  // strings provide a convenient way to create
  // ranges of characters, but you should
  // normally look for native string operations:
  string s1("This is a test");
  string s2("This is a Test");
  cout << "s1: " << s1 << endl
    << "s2: " << s2 << endl;
  cout << "compare s1 & s1: "
    << equal(s1.begin(), s1.end(), s1.begin())
    << endl;
  cout << "compare s1 & s2: "
    << equal(s1.begin(), s1.end(), s2.begin())
    << endl;
  cout << "lexicographical_compare s1 & s1: " <<
    lexicographical_compare(s1.begin(), s1.end(),
      s1.begin(), s1.end()) << endl;
  cout << "lexicographical_compare s1 & s2: " <<
    lexicographical_compare(s1.begin(), s1.end(),
      s2.begin(), s2.end()) << endl;
  cout << "lexicographical_compare s2 & s1: " <<
    lexicographical_compare(s2.begin(), s2.end(),
      s1.begin(), s1.end()) << endl;
  cout << "lexicographical_compare shortened "
    "s1 & full-length s2: " << endl;
  string s3(s1);
  while(s3.length() != 0) {
    bool result = lexicographical_compare(
      s3.begin(), s3.end(), s2.begin(), s2.end());
    cout << s3 << endl << s2 << ", result = "
      << result << endl;
    if(result == true) break;
    s3 = s3.substr(0, s3.length() - 1);
  }
  pair<string::iterator, string::iterator> p =
    mismatch(s1.begin(), s1.end(), s2.begin());
  print(p.first, s1.end(), "p.first", "");
  print(p.second, s2.end(), "p.second", "");
} ///:~

Note that the only difference between s1 and s2 is the capital 'T' in s2's "Test." Comparing s1
and s1 for equality yields true, as expected, while s1 and s2 are not equal because of the
capital 'T'.

To understand the output of the lexicographical_compare( ) tests, you must remember two
things: first, the comparison is performed character-by-character, and second, capital
letters "precede" lowercase letters. In the first test, s1 is compared to s1. These are exactly
equivalent, thus one is not lexicographically less than the other (which is what the comparison
is looking for) and thus the result is false. The second test is asking "does s1 precede s2?"
When the comparison gets to the 't' in "test", it discovers that the lowercase 't' in s1 is
"greater" than the uppercase 'T' in s2, so the answer is again false. However, if we test to see
whether s2 precedes s1, the answer is true.

To further examine lexicographical comparison, the next test in the above example compares
s1 with s2 again (which returned false before). But this time it repeats the comparison,
trimming one character off the end of s1 (which is first copied into s3) each time through the
loop until the test evaluates to true. What you'll see is that, as soon as the lowercase 't' of
"test" is trimmed off of s3 (the copy of s1), the remaining characters are exactly equal to the
start of s2, so they no longer decide the result, and the fact that s3 is shorter than s2 is what
makes it lexicographically precede s2.

The final test uses mismatch( ). In order to capture the return value, you must first create the
appropriate pair p, constructing the template using the iterator type from the first range and
the iterator type from the second range (in this case, both string::iterators). To print the
results, the iterator for the mismatch in the first range is p.first, and for the second range is
p.second. In both cases, the range is printed from the mismatch iterator to the end of the range
so you can see exactly where the iterator points.



Removing elements 



Because of the genericity of the STL, the concept of removal is a bit constrained. Since
elements can only be "removed" via iterators, and iterators can point to arrays, vectors, lists,
etc., it is not safe or reasonable to actually try to destroy the elements that are being removed,
and to change the size of the input range [first, last) (an array, for example, cannot have its
size changed). So instead, what the STL "remove" functions do is rearrange the sequence so
that the "un-removed" elements are at the beginning of the sequence (in the same order that
they were before, minus the removed elements; that is, this is a stable operation) and the
leftover positions are at the end. Then the function will return an iterator to the "new last"
element of the sequence, which is the end of the sequence without the removed elements and
the beginning of the leftover tail. In other words, if new_last is the iterator that is returned
from the "remove" function, then [first, new_last) is the sequence without any of the removed
elements, and [new_last, last) is the leftover tail, whose element values are unspecified.

If you are simply passing your sequence, without the removed elements, to more STL
algorithms, you can just use new_last as the new past-the-end iterator. However, if you're
using a resizable container c (not an array) and you actually want to eliminate the removed
elements from the container, you can use erase( ) to do so, for example:

c.erase(remove(c.begin(), c.end(), value), c.end());

The return value of remove( ) is then the new_last iterator, so erase( ) will delete all the
removed elements from c.

The iterators in [new_last, last) are dereferenceable but the element values are undefined and
should not be used.

ForwardIterator remove(ForwardIterator first, ForwardIterator last, const T& value);
ForwardIterator remove_if(ForwardIterator first, ForwardIterator last,
  Predicate pred);
OutputIterator remove_copy(InputIterator first, InputIterator last,
  OutputIterator result, const T& value);
OutputIterator remove_copy_if(InputIterator first, InputIterator last,
  OutputIterator result, Predicate pred);

Each of the "remove" forms moves through the range [first, last), finding values that match a 
removal criterion and copying the un-removed elements over the removed elements (thus 
effectively removing them). The original order of the un-removed elements is maintained. 
The return value is an iterator pointing past the end of the range that contains none of the 
removed elements. The values that this iterator points to are unspecified. 

The "if" versions pass each element to pred( ) to determine whether it should be removed or
not (if pred( ) returns true, the element is removed). The "copy" versions do not modify the
original sequence, but instead copy the un-removed values into a range beginning at result,
and return an iterator indicating the past-the-end value of this new range.

ForwardIterator unique(ForwardIterator first, ForwardIterator last);
ForwardIterator unique(ForwardIterator first, ForwardIterator last,
  BinaryPredicate binary_pred);
OutputIterator unique_copy(InputIterator first, InputIterator last,
  OutputIterator result);
OutputIterator unique_copy(InputIterator first, InputIterator last,
  OutputIterator result, BinaryPredicate binary_pred);

Each of the "unique" functions moves through the range [first, last), finding adjacent values 
that are equivalent (that is, duplicates) and "removing" the duplicate elements by copying 
over them. The original order of the un-removed elements is maintained. The return value is 
an iterator pointing past the end of the range that has the adjacent duplicates removed. 



Because only duplicates that are adjacent are removed, it's likely that you'll want to call
sort( ) before calling a "unique" algorithm, since that will guarantee that all the duplicates
are removed.

The versions containing binary_pred call, for each iterator value i in the input range after
the first:

binary_pred(*(i-1), *i);

and if the result is true then *i is considered a duplicate.

The "copy" versions do not modify the original sequence, but instead copy the un-removed
values into a range beginning at result, and return an iterator indicating the past-the-end value
of this new range.

Example 

//: C05:Removing.cpp
// The removing algorithms
#include "PrintSequence.h"
#include "Generators.h"
#include <vector>
#include <set>
#include <algorithm>
#include <functional>
#include <iterator>
#include <cctype>
using namespace std;

struct IsUpper {
  bool operator()(char c) {
    return isupper(c);
  }
};

int main() {
  vector<char> v(50);
  generate(v.begin(), v.end(), CharGen());
  print(v, "v", "");
  // Create a set of the characters in v:
  set<char> cs(v.begin(), v.end());
  set<char>::iterator it = cs.begin();
  vector<char>::iterator cit;
  // Step through and remove everything:
  while(it != cs.end()) {
    cit = remove(v.begin(), v.end(), *it);
    cout << *it << "[" << *cit << "] ";
    print(v, "", "");
    it++;
  }
  generate(v.begin(), v.end(), CharGen());
  print(v, "v", "");
  cit = remove_if(v.begin(), v.end(), IsUpper());
  print(v.begin(), cit, "after remove_if", "");
  // Copying versions are not shown for remove
  // and remove_if.
  sort(v.begin(), cit);
  print(v.begin(), cit, "sorted", "");
  vector<char> v2;
  unique_copy(v.begin(), cit, back_inserter(v2));
  print(v2, "unique_copy", "");
  // Same behavior:
  cit = unique(v.begin(), cit, equal_to<char>());
  print(v.begin(), cit, "unique", "");
} ///:~

The vector<char> v is filled with randomly-generated characters and then copied into a set. 
Each element of the set is used in a remove statement, but the entire vector v is printed out 
each time so you can see what happens to the rest of the range, after the resulting endpoint 
(which is stored in cit). 

To demonstrate remove_if( ), the Standard C library function isupper( ) (in <cctype>) is
called inside the function object class IsUpper, an object of which is passed as
the predicate for remove_if( ). This only returns true if a character is uppercase, so only
lowercase characters will remain. Here, the end of the range is used in the call to print( ) so
only the remaining elements will appear. The copying versions of remove( ) and remove_if( )
are not shown because they are a simple variation on the non-copying versions which you
should be able to use without an example.

The range of lowercase letters is sorted in preparation for testing the "unique" functions (the
"unique" functions are still well defined if the range isn't sorted, but the result is probably
not what you want). First, unique_copy( ) puts the unique elements into a new vector using the
default element comparison, and then the form of unique( ) that takes a predicate is used; the
predicate used is the built-in function object equal_to( ), which produces the same results as
the default element comparison.

Sorting and operations on sorted ranges 

There is a significant category of STL algorithms which require that the range they operate on
be in sorted order.

There is actually only one "sort" algorithm used in the STL. This algorithm is presumably the
fastest one, but the implementer has fairly broad latitude. However, it comes packaged in
various flavors depending on whether the sort should be stable, partial or just the regular sort.
Oddly enough, only the partial sort has a copying version; otherwise you'll need to make your
own copy before sorting if that's what you want. If you are working with a very large number
of items you may be better off transferring them to an array (or at least a vector, which uses
an array internally) rather than using them in some of the STL containers.

Once your sequence is sorted, there are many operations you can perform on that sequence,
from simply locating an element or group of elements to merging with another sorted
sequence or manipulating sequences as mathematical sets.

Each algorithm involved with sorting or operations on sorted sequences comes in two
versions, the first that uses the object's own operator< to perform the comparison, and
the second that uses an additional StrictWeakOrdering object's operator( )(a, b) to compare
two objects for a < b. Other than this there are no differences, so the distinction will not be
pointed out in the description of each algorithm.



Sorting 



One STL container (list) has its own built-in sort( ) function which is almost certainly going
to be faster than the generic sort presented here (especially since the list sort just swaps
pointers rather than copying entire objects around). This means that you'll only want to use
the sort functions here if (a) you're working with an array or a sequence container that doesn't
have a sort( ) function or (b) you want to use one of the other sorting flavors, like a partial or
stable sort, which aren't supported by list's sort( ).

void sort(RandomAccessIterator first, RandomAccessIterator last);
void sort(RandomAccessIterator first, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Sorts [first, last) into ascending order. The second form allows a comparator object to 
determine the order. 

void stable_sort(RandomAccessIterator first, RandomAccessIterator last);
void stable_sort(RandomAccessIterator first, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Sorts [first, last) into ascending order, preserving the original ordering of equivalent elements 
(this is important if elements can be equivalent but not identical). The second form allows a 
comparator object to determine the order. 

void partial_sort(RandomAccessIterator first,
  RandomAccessIterator middle, RandomAccessIterator last);
void partial_sort(RandomAccessIterator first,
  RandomAccessIterator middle, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Sorts the number of elements from [first, last) that can be placed in the range [first, middle).
The rest of the elements end up in [middle, last), and have no guaranteed order. The second
form allows a comparator object to determine the order.

RandomAccessIterator partial_sort_copy(InputIterator first, InputIterator last,
  RandomAccessIterator result_first, RandomAccessIterator result_last);
RandomAccessIterator partial_sort_copy(InputIterator first,
  InputIterator last, RandomAccessIterator result_first,
  RandomAccessIterator result_last, StrictWeakOrdering binary_pred);

Sorts the number of elements from [first, last) that can be placed in the range [result_first,
result_last), and copies those elements into [result_first, result_last). If the range [first,
last) is smaller than [result_first, result_last), then the smaller number of elements is used.
The second form allows a comparator object to determine the order.

void nth_element(RandomAccessIterator first,
  RandomAccessIterator nth, RandomAccessIterator last);
void nth_element(RandomAccessIterator first,
  RandomAccessIterator nth, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Just like partial_sort( ), nth_element( ) partially orders a range of elements. However, it's
much "less ordered" than partial_sort( ). The only thing that nth_element( ) guarantees is
that whatever location you choose will become a dividing point. All the elements in the range
[first, nth) will be less than (they could also be equivalent to) whatever element ends up at
location nth and all the elements in the range (nth, last) will be greater than (or equivalent to)
whatever element ends up at location nth. However, neither range is in any particular order,
unlike partial_sort( ) which has the first range in sorted order.

If all you need is this very weak ordering (if, for example, you're determining medians,
percentiles and that sort of thing) this algorithm is faster than partial_sort( ).



Example 



The StreamTokenizer class from the previous chapter is used to break a file into words, and
each word is turned into an NString and added to a deque<NString>. Once the input file is
completely read, a vector<NString> is created from the contents of the deque. The vector is
then used to demonstrate the sorting algorithms:

//: C05:SortTest.cpp
//{L} ../C04/StreamTokenizer
#include "../C04/StreamTokenizer.h"
#include "NString.h"
#include "PrintSequence.h"
#include "Generators.h"
#include "../require.h"
#include <algorithm>
#include <fstream>
#include <queue>
#include <vector>
#include <cctype>
using namespace std;

// For sorting NStrings and ignore string case:
struct NoCase {
  bool operator()(
    const NString& x, const NString& y) {
/* Something's wrong with this approach but I
   can't seem to see it. It would be much faster:
    const string& lv = x;
    const string& rv = y;
    int len = min(lv.size(), rv.size());
    for(int i = 0; i < len; i++)
      if(tolower(lv[i]) < tolower(rv[i]))
        return true;
    return false;
  }
*/
    // Brute force: copy, force to lowercase:
    string lv(x);
    string rv(y);
    lcase(lv);
    lcase(rv);
    return lv < rv;
  }
  void lcase(string& s) {
    int n = s.size();
    for(int i = 0; i < n; i++)
      s[i] = tolower(s[i]);
  }
};

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  deque<NString> nstr;
  string word;
  while((word = words.next()).size() != 0)
    nstr.push_back(NString(word));
  print(nstr);
  // Create a vector from the contents of nstr:
  vector<NString> v(nstr.begin(), nstr.end());
  sort(v.begin(), v.end());
  print(v, "sort");
  // Use an additional comparator object:
  sort(v.begin(), v.end(), NoCase());
  print(v, "sort NoCase");
  copy(nstr.begin(), nstr.end(), v.begin());
  stable_sort(v.begin(), v.end());
  print(v, "stable_sort");
  // Use an additional comparator object:
  stable_sort(v.begin(), v.end(),
    greater<NString>());
  print(v, "stable_sort greater");
  copy(nstr.begin(), nstr.end(), v.begin());
  // Partial sorts. The additional comparator
  // versions are obvious and not shown here.
  partial_sort(v.begin(),
    v.begin() + v.size()/2, v.end());
  print(v, "partial_sort");
  // Create a vector with a preallocated size:
  vector<NString> v2(v.size()/2);
  partial_sort_copy(v.begin(), v.end(),
    v2.begin(), v2.end());
  print(v2, "partial_sort_copy");
  // Finally, the weakest form of ordering:
  vector<int> v3(20);
  generate(v3.begin(), v3.end(), URandGen(50));
  print(v3, "v3 before nth_element");
  int n = 10;
  vector<int>::iterator vit = v3.begin() + n;
  nth_element(v3.begin(), vit, v3.end());
  cout << "After ordering with nth = " << n
    << ", nth element is " << v3[n] << endl;
  print(v3, "v3 after nth_element");
} ///:~

The first class is a binary predicate used to compare two NString objects while ignoring the
case of the strings. You can pass the object into the various sort routines to produce an
alphabetic sort (rather than the default lexicographic sort, which has all the capital letters in
one group, followed by all the lowercase letters).

As an example, try the source code for the above file as input. Because the occurrence
numbers are printed along with the strings you can distinguish between an ordinary sort and a
stable sort, and you can also see what happens during a partial sort (the remaining unsorted
elements are in no particular order). There is no "partial stable sort."

You'll notice that the second "comparator" forms of the functions are not
exhaustively tested in the above example, but the use of a comparator is the same as in the
first part of the example.

The test of nth_element( ) does not use the NString objects because it's simpler to see what's
going on if ints are used. Notice that, whatever the nth element turns out to be (which will
vary from one run to another because of URandGen), the elements before that are less, and
after that are greater, but the elements have no particular order other than that. Because of
URandGen, there are no duplicates but if you use a generator that allows duplicates you can
see that the elements before the nth element will be less than or equal to the nth element.



Locating elements in sorted ranges 



Once a range is sorted, there is a group of operations that can be used to find elements within
those ranges. In the following functions, there are always two forms, one that assumes the
intrinsic operator< has been used to perform the sort, and the second that must be used if
some other comparison function object has been used to perform the sort. You must use the
same comparison for locating elements as you do to perform the sort, otherwise the results are
undefined. In addition, if you try to use these functions on unsorted ranges the results will be
undefined.

bool binary_search(ForwardIterator first, ForwardIterator last, const T& value);
bool binary_search(ForwardIterator first, ForwardIterator last, const T& value,
  StrictWeakOrdering binary_pred);

Tells you whether value appears in the sorted range [first, last). 

ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
  const T& value);
ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
  const T& value, StrictWeakOrdering binary_pred);

Returns an iterator indicating the first occurrence of value in the sorted range [first, last).
If value is not found, returns the position where it could be inserted without disturbing the
sorted order (which is last only if every element is less than value).

ForwardIterator upper_bound(ForwardIterator first, ForwardIterator last,
  const T& value);
ForwardIterator upper_bound(ForwardIterator first, ForwardIterator last,
  const T& value, StrictWeakOrdering binary_pred);

Returns an iterator indicating one past the last occurrence of value in the sorted range [first,
last). If value is not found, returns the position where it could be inserted without disturbing
the sorted order, which is the same position that lower_bound( ) returns.

pair<ForwardIterator, ForwardIterator>
  equal_range(ForwardIterator first, ForwardIterator last,
  const T& value);
pair<ForwardIterator, ForwardIterator>
  equal_range(ForwardIterator first, ForwardIterator last,
  const T& value, StrictWeakOrdering binary_pred);

Essentially combines lower_bound( ) and upper_bound( ) to return a pair indicating the
first and one-past-the-last occurrences of value in the sorted range [first, last). If value is
not found, both iterators indicate the position where it could be inserted, so the resulting
range is empty.



Example 



//: C05:SortedSearchTest.cpp
//{L} ../C04/StreamTokenizer
// Test searching in sorted ranges
#include "../C04/StreamTokenizer.h"
#include "PrintSequence.h"
#include "NString.h"
#include "../require.h"
#include <algorithm>
#include <fstream>
#include <queue>
#include <vector>
using namespace std;

int main() {
  ifstream in("SortedSearchTest.cpp");
  assure(in, "SortedSearchTest.cpp");
  StreamTokenizer words(in);
  deque<NString> dstr;
  string word;
  while((word = words.next()).size() != 0)
    dstr.push_back(NString(word));
  vector<NString> v(dstr.begin(), dstr.end());
  sort(v.begin(), v.end());
  print(v, "sorted");
  typedef vector<NString>::iterator sit;
  sit it, it2;
  string f("include");
  cout << "binary search: "
    << binary_search(v.begin(), v.end(), f)
    << endl;
  it = lower_bound(v.begin(), v.end(), f);
  it2 = upper_bound(v.begin(), v.end(), f);
  print(it, it2, "found range");
  pair<sit, sit> ip =
    equal_range(v.begin(), v.end(), f);
  print(ip.first, ip.second, "equal_range");
} ///:~





The input is forced to be the source code for this file because the word "include" will be used
for a find string (since "include" appears many times). The file is tokenized into words that
are placed into a deque (a better container when you don't know how much storage to
allocate), and left unsorted in the deque. The deque is copied into a vector via the
appropriate constructor, and the vector is sorted and printed.

The binary_search( ) function only tells you if the object is there or not; lower_bound( ) and
upper_bound( ) produce iterators to the beginning and ending positions where the matching
objects appear. The same effect can be produced more succinctly using equal_range( ) (as
shown in the previous chapter, with multimap and multiset).



Merging sorted ranges 



As before, the first form of each function assumes the intrinsic operator< has been used to
perform the sort. The second form must be used if some other comparison function object has
been used to perform the sort. You must use the same comparison for merging as
you do to perform the sort, otherwise the results are undefined. In addition, if you try to use
these functions on unsorted ranges the results will be undefined.

OutputIterator merge(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result);

OutputIterator merge(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result,
  StrictWeakOrdering binary_pred);

Copies elements from [first1, last1) and [first2, last2) into result, such that the resulting
range is sorted in ascending order. This is a stable operation.

void inplace_merge(BidirectionalIterator first,
  BidirectionalIterator middle, BidirectionalIterator last);
void inplace_merge(BidirectionalIterator first,
  BidirectionalIterator middle, BidirectionalIterator last,
  StrictWeakOrdering binary_pred);

This assumes that [first, middle) and [middle, last) are each sorted ranges. The two ranges
are merged so that the resulting range [first, last) contains the combined ranges in sorted
order.


Example 



It's easier to see what goes on with merging if ints are used; the following example also
shows how the algorithms (and my own print template) work with arrays as well as
containers.

#include <algorithm>
#include "PrintSequence.h"
#include "Generators.h"
using namespace std;

int main() {
  const int sz = 15;
  int a[sz*2] = {0};
  // Both ranges go in the same array:
  generate(a, a + sz, SkipGen(0, 2));
  generate(a + sz, a + sz*2, SkipGen(1, 3));
  print(a, a + sz, "range1", " ");
  print(a + sz, a + sz*2, "range2", " ");
  int b[sz*2] = {0}; // Initialize all to zero
  merge(a, a + sz, a + sz, a + sz*2, b);
  print(b, b + sz*2, "merge", " ");
  set_union(a, a + sz, a + sz, a + sz*2, b);
  print(b, b + sz*2, "set_union", " ");
  inplace_merge(a, a + sz, a + sz*2);
  print(a, a + sz*2, "inplace_merge", " ");
} ///:~

In main( ), instead of creating two separate arrays both ranges will be created end-to-end in
the same array a (this will come in handy for the inplace_merge( )). The first call to merge( )
places the result in a different array, b. For comparison, set_union( ) is also called, which has
the same signature and similar behavior, except that it removes the duplicates. Finally,
inplace_merge( ) is used to combine both parts of a.

Set Operations on sorted ranges 

Once ranges have been sorted, you can perform mathematical set operations on them.

bool includes(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2);
bool includes(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2,
  StrictWeakOrdering binary_pred);

Returns true if [first2, last2) is a subset of [first1, last1). Neither range is required to hold
only unique elements, but if [first2, last2) holds n elements of a particular value, then [first1,
last1) must hold at least n elements of that value if the result is to be true.

OutputIterator set_union(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result);

OutputIterator set_union(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result,
  StrictWeakOrdering binary_pred);

Creates the mathematical union of two sorted ranges in the result range, returning the end of
the output range. Neither input range is required to hold only unique elements, but if a
particular value appears multiple times in both input sets, then the resulting set will contain
the larger number of identical values.

OutputIterator set_intersection(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result);

OutputIterator set_intersection(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result,
  StrictWeakOrdering binary_pred);

Produces, in result, the intersection of the two input sets, returning the end of the output 
range. That is, the set of values that appear in both input sets. Neither input range is required 
to hold only unique elements, but if a particular value appears multiple times in both input 
sets, then the resulting set will contain the smaller number of identical values. 

OutputIterator set_difference(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result);
OutputIterator set_difference(InputIterator1 first1, InputIterator1 last1,
  InputIterator2 first2, InputIterator2 last2, OutputIterator result,
  StrictWeakOrdering binary_pred);

Produces, in result, the mathematical set difference, returning the end of the output range. All
the elements that are in [first1, last1) but not in [first2, last2) are placed in the result set.
Neither input range is required to hold only unique elements, but if a particular value appears
multiple times in both input sets (n times in set 1 and m times in set 2), then the resulting set
will contain max(n-m, 0) copies of that value.

OutputIterator set_symmetric_difference(InputIterator1 first1,
  InputIterator1 last1, InputIterator2 first2, InputIterator2 last2,
  OutputIterator result);
OutputIterator set_symmetric_difference(InputIterator1 first1,
  InputIterator1 last1, InputIterator2 first2, InputIterator2 last2,
  OutputIterator result, StrictWeakOrdering binary_pred);

Constructs, in result, the set containing:

• All the elements in set 1 that are not in set 2

• All the elements in set 2 that are not in set 1.

Neither input range is required to hold only unique elements, but if a particular value appears
multiple times in both input sets (n times in set 1 and m times in set 2), then the resulting set
will contain abs(n-m) copies of that value, where abs( ) is the absolute value. The return
value is the end of the output range.



Example 



//: C05:SetOperations.cpp
// Set operations on sorted ranges
#include <vector>
#include <algorithm>
#include "PrintSequence.h"
#include "Generators.h"
using namespace std;

int main() {
  vector<char> v(50), v2(50);
  CharGen g;
  generate(v.begin(), v.end(), g);
  generate(v2.begin(), v2.end(), g);
  sort(v.begin(), v.end());
  sort(v2.begin(), v2.end());
  print(v, "v", "");
  print(v2, "v2", "");
  bool b = includes(v.begin(), v.end(),
    v.begin() + v.size()/2, v.end());
  cout << "includes: " <<
    (b ? "true" : "false") << endl;
  vector<char> v3, v4, v5, v6;
  set_union(v.begin(), v.end(),
    v2.begin(), v2.end(), back_inserter(v3));
  print(v3, "set_union", "");
  set_intersection(v.begin(), v.end(),
    v2.begin(), v2.end(), back_inserter(v4));
  print(v4, "set_intersection", "");
  set_difference(v.begin(), v.end(),
    v2.begin(), v2.end(), back_inserter(v5));
  print(v5, "set_difference", "");
  set_symmetric_difference(v.begin(), v.end(),
    v2.begin(), v2.end(), back_inserter(v6));
  print(v6, "set_symmetric_difference", "");
} ///:~

After v and v2 are generated, sorted and printed, the includes( ) algorithm is tested by seeing
if the entire range of v contains the last half of v, which of course it does so the result should
always be true. The vectors v3, v4, v5 and v6 are created to hold the output of set_union( ),
set_intersection( ), set_difference( ) and set_symmetric_difference( ), and the results of
each are displayed so you can ponder them and convince yourself that the algorithms do
indeed work as promised.



Heap operations 



The heap operations in the STL are primarily concerned with the creation of the STL
priority_queue, which provides efficient access to the "largest" element, whatever "largest"
happens to mean for your program. These were discussed in some detail in the previous
chapter, and you can find an example there.

As with the "sort" operations, there are two versions of each function, the first that uses the 
object's own operator< to perform the comparison, the second that uses an additional 
StrictWeakOrdering object's operator( )(a, b) to compare two objects for a < b. 

void make_heap(RandomAccessIterator first, RandomAccessIterator last);
void make_heap(RandomAccessIterator first, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Turns an arbitrary range into a heap. A heap is just a range that is organized in a particular
way.
void push_heap(RandomAccessIterator first, RandomAccessIterator last);
void push_heap(RandomAccessIterator first, RandomAccessIterator last,
  StrictWeakOrdering binary_pred);

Adds the element *(last-1) to the heap determined by the range [first, last-1). Yes, it seems
like an odd way to do things but remember that the priority_queue container presents the
nice interface to a heap, as shown in the previous chapter.

void pop_heap(RandomAccessIterator first, RandomAccessIterator last); 
void pop_heap(RandomAccessIterator first, RandomAccessIterator last, 
  StrictWeakOrdering binary_pred); 

Places the largest element (which is actually in *first, before the operation, because of the 
way heaps are defined) into the position *(last-1) and reorganizes the remaining range so that 
it's still in heap order. If you simply grabbed *first, the next element would not be the 
next-largest element, so you must use pop_heap( ) if you want to maintain the heap in its proper 
priority-queue order. 

void sort_heap(RandomAccessIterator first, RandomAccessIterator last); 
void sort_heap(RandomAccessIterator first, RandomAccessIterator last, 
  StrictWeakOrdering binary_pred); 

This could be thought of as the complement of make_heap( ), since it takes a range that is in 
heap order and turns it into ordinary sorted order, so it is no longer a heap. That means that if 
you call sort_heap( ) you can no longer use push_heap( ) or pop_heap( ) on that range 
(rather, you can use those functions but they won't do anything sensible). This is not a stable 
sort. 
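
To make the four heap functions concrete, here is a small sketch that uses them directly on a vector<int>. The helper names heapInsert and drainLargestFirst are hypothetical, invented for this illustration; only the std:: calls come from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: add x to a vector that is already in heap order.
void heapInsert(std::vector<int>& heap, int x) {
  heap.push_back(x);                         // new element sits at *(last-1)
  std::push_heap(heap.begin(), heap.end());  // restore heap order
  assert(heap.front() >= x);                 // the largest is always in front
}

// Hypothetical helper: return the elements of v from largest to smallest,
// removing one element at a time the way a priority queue would.
std::vector<int> drainLargestFirst(std::vector<int> v) {
  std::make_heap(v.begin(), v.end());
  std::vector<int> out;
  while(!v.empty()) {
    std::pop_heap(v.begin(), v.end());  // moves the largest to v.back()
    out.push_back(v.back());
    v.pop_back();                       // shrink the heap range by one
  }
  return out;
}
```

Note that pop_heap( ) does not remove anything from the container; it only moves the largest element to the end of the range, which is why the sketch follows it with pop_back( ).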



Applying an operation to each element 
in a range 



These algorithms differ in what they do with the results of the operation: for_each( ) discards 
the return value of the operation (but returns the function object that has been applied to each 
element), while transform( ) places the results of each operation into a destination sequence 
(which can be the original sequence). 

UnaryFunction for_each(InputIterator first, InputIterator last, UnaryFunction f); 

Applies the function object f to each element in [first, last), discarding the return value from 
each individual application of f. If f is just a function pointer then you are typically not 
interested in the return value, but if f is an object that maintains some internal state it can 
capture the combined return value of being applied to the range. The final return value of 
for_each( ) is f. 

OutputIterator transform(InputIterator first, InputIterator last, 
  OutputIterator result, UnaryFunction f); 
OutputIterator transform(InputIterator1 first, InputIterator1 last, 
  InputIterator2 first2, OutputIterator result, BinaryFunction f); 

Like for_each( ), transform( ) applies a function object f to each element in the range [first, 
last). However, instead of discarding the result of each function call, transform( ) copies the 
result (using operator=) into *result, incrementing result after each copy (the sequence 
pointed to by result must have enough storage, otherwise you should use an inserter to force 
insertions instead of assignments). 

The first form of transform( ) simply calls f( ) and passes it each object from the input range 
as an argument. The second form passes an object from the first input range and one from the 
second input range as the two arguments to the binary function f (note the length of the 
second input range is determined by the length of the first). The return value in both cases is 
the past-the-end iterator for the resulting output range. 
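
As a quick illustration of the two forms, the following sketch runs both on small integer arrays and then repeats the first form in place. The function twice and the demo function are hypothetical names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

int twice(int x) { return 2 * x; }  // hypothetical unary operation

inline void transformDemo() {
  int a[] = {1, 2, 3};
  int b[] = {10, 20, 30};
  std::vector<int> doubled, sums;
  // First form: one input range, a unary function, and an inserter
  // as the destination because the vectors start out empty.
  std::transform(a, a + 3, std::back_inserter(doubled), twice);
  // Second form: two input ranges and a binary function object.
  std::transform(a, a + 3, b, std::back_inserter(sums), std::plus<int>());
  assert(doubled[0] == 2 && doubled[1] == 4 && doubled[2] == 6);
  assert(sums[0] == 11 && sums[1] == 22 && sums[2] == 33);
  // In place: the destination is the start of the input range.
  std::transform(a, a + 3, a, twice);
  assert(a[0] == 2 && a[1] == 4 && a[2] == 6);
}
```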



Examples 

First, consider for_each( ). This sweeps through the range, pulling out each element and 
passing it as an argument as it calls whatever function object it's been given. Thus for_each( ) 
performs operations that you might normally write out by hand. In Stlshape.cpp, for 
example: 

for(Iter j = shapes.begin(); 
    j != shapes.end(); j++) 
  delete *j; 

If you look in your compiler's header file at the template defining for_each( ), you'll see 
something like this: 
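
A typical definition has roughly the shape sketched below. This is an approximation rather than any particular vendor's code, and it is named myForEach (a made-up name) so that it cannot collide with the real std::for_each. The CountCalls object is likewise hypothetical; it is there only to show that the returned function object carries its state back to the caller.

```cpp
#include <cassert>

// Sketch of a for_each-style template; real library versions differ in detail.
template<class InputIterator, class Function>
Function myForEach(InputIterator first, InputIterator last, Function f) {
  while(first != last)
    f(*first++);  // the only requirement on f: callable with one argument
  return f;       // hand back the (possibly stateful) function object
}

// Hypothetical function object that counts how many elements it has seen.
struct CountCalls {
  int calls;
  CountCalls() : calls(0) {}
  void operator()(int) { ++calls; }
};

inline void forEachDemo() {
  int a[] = {10, 20, 30};
  CountCalls c = myForEach(a, a + 3, CountCalls());
  assert(c.calls == 3);
}
```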



Function f looks at first like it must be a pointer to a function which takes, as an argument, an 
object of whatever InputIterator selects. However, the above template actually only says that 
you must be able to call f using parentheses and an argument. This is true for a function 
pointer, but it's also true for a function object - any class that defines the appropriate 
operator( ). The following example shows several different ways this template can be 
expanded. First, we need a class that keeps track of its objects so we can know that it's being 
properly destroyed: 

//: C05:Counted.h 
// An object that keeps track of itself 
#ifndef COUNTED_H 
#define COUNTED_H 
#include <vector> 
#include <iostream> 

class Counted { 
  static int count; 
  const char* ident; // const so string literals can be passed 
public: 
  Counted(const char* id) : ident(id) { count++; } 
  ~Counted() { 
    std::cout << ident << " count = " 
      << --count << std::endl; 
  } 
}; 

int Counted::count = 0; 

class CountedVector : 
  public std::vector<Counted*> { 
public: 
  CountedVector(const char* id) { 
    for(int i = 0; i < 5; i++) 
      push_back(new Counted(id)); 
  } 
}; 
#endif // COUNTED_H ///:~ 



The class Counted keeps a static count of how many Counted objects have been created, and 
tells you as they are destroyed. In addition, each Counted keeps a char* identifier to make 
tracking the output easier. 

The CountedVector is inherited from vector<Counted*>, and in the constructor it creates 
some Counted objects, handing each one your desired char*. The CountedVector makes 
testing quite simple, as you'll see. 

//: C05:ForEach.cpp 
// Use of STL for_each() algorithm 
#include "Counted.h" 
#include <iostream> 
#include <vector> 
#include <algorithm> 
using namespace std; 

// Simple function: 
void destroy(Counted* fp) { delete fp; } 

// Function object: 
template<class T> 
class DeleteT { 
public: 
  void operator()(T* x) { delete x; } 
}; 

// Template function: 
template <class T> 
void wipe(T* x) { delete x; } 

int main() { 
  CountedVector A("one"); 
  for_each(A.begin(), A.end(), destroy); 
  CountedVector B("two"); 
  for_each(B.begin(), B.end(), DeleteT<Counted>()); 
  CountedVector C("three"); 
  for_each(C.begin(), C.end(), wipe<Counted>); 
} ///:~ 

In main( ), the first approach is the simple pointer-to-function, which works but has the 
drawback that you must write a new destroy( ) function for each different type. The obvious 
solution is to make a template, which is shown in the second approach with a templatized 
function object. On the other hand, approach three also makes sense: template functions work 
as well. 

Since this is obviously something you might want to do a lot, why not create an algorithm to 
delete all the pointers in a container? This was accomplished with the purge( ) template 
created in the previous chapter. However, that used explicitly-written code; here, we could 
use transform( ). The value of transform( ) over for_each( ) is that transform( ) assigns the 
result of calling the function object into a resulting range, which can actually be the input 
range. That case means a literal transformation for the input range, since each element would 
be a modification of its previous value. In the above example this would be especially useful 
since it's more appropriate to assign each pointer to the safe value of zero after calling delete 
for that pointer. transform( ) can easily do this: 

//: C05:Transform.cpp 
// Use of STL transform() algorithm 
#include "Counted.h" 
#include <iostream> 
#include <vector> 
#include <algorithm> 
using namespace std; 

template<class T> 
T* deleteP(T* x) { delete x; return 0; } 

template<class T> struct Deleter { 
  T* operator()(T* x) { delete x; return 0; } 
}; 

int main() { 
  CountedVector cv("one"); 
  transform(cv.begin(), cv.end(), cv.begin(), 
    deleteP<Counted>); 
  CountedVector cv2("two"); 
  transform(cv2.begin(), cv2.end(), cv2.begin(), 
    Deleter<Counted>()); 
} ///:~ 

This shows both approaches: using a template function or a templatized function object. After 
the call to transform( ), the vector contains zero pointers, which is safer since any duplicate 
deletes will have no effect. 

One thing you cannot do is delete every pointer in a collection without wrapping the call to 
delete inside a function or an object. That is, you don't want to say something like this: 

for_each(a.begin(), a.end(), ptr_fun(operator delete)); 

You can say it, but what you'll get is a sequence of calls to the function that releases the 
storage. You will not get the effect of calling delete for each pointer in a, however; the 
destructor will not be called. This is typically not what you want, so you will need to wrap 
your calls to delete. 

In the previous example of for_each(), the return value of the algorithm was ignored. This 
return value is the function that is passed in to for_each( ). If the function is just a pointer to a 
function, then the return value is not very useful, but if it is a function object, then that 
function object may have internal member data that it uses to accumulate information about 
all the objects that it sees during for_each( ). 

For example, consider a simple model of inventory. Each Inventory object has the type of 
product it represents (here, single characters will be used for product names), the quantity of 
that product and the price of each item: 

//: C05:Inventory.h 
#ifndef INVENTORY_H 
#define INVENTORY_H 
#include <iostream> 
#include <cstdlib> 
#include <ctime> 

class Inventory { 
  char item; 
  int quantity; 
  int value; 
public: 
  Inventory(char it, int quant, int val) 
    : item(it), quantity(quant), value(val) {} 
  // Synthesized operator= & copy-constructor OK 
  char getItem() const { return item; } 
  int getQuantity() const { return quantity; } 
  void setQuantity(int q) { quantity = q; } 
  int getValue() const { return value; } 
  void setValue(int val) { value = val; } 
  friend std::ostream& operator<<( 
    std::ostream& os, const Inventory& inv) { 
    return os << inv.item << ": " 
      << "quantity " << inv.quantity 
      << ", value " << inv.value; 
  } 
}; 

// A generator: 
struct InvenGen { 
  InvenGen() { std::srand(std::time(0)); } 
  Inventory operator()() { 
    static char c = 'a'; 
    int q = std::rand() % 100; 
    int v = std::rand() % 500; 
    return Inventory(c++, q, v); 
  } 
}; 
#endif // INVENTORY_H ///:~ 



There are member functions to get the item name, and to get and set quantity and value. An 
operator<< prints the Inventory object to an ostream. There's also a generator that creates 
objects that have sequentially-labeled items and random quantities and values. Note the use of 
the return value optimization in operator( ). 

To find out the total number of items and total value, you can create a function object to use 
with for_each( ) that has data members to hold the totals: 

//: C05:CalcInventory.cpp 
// More use of for_each() 
#include "Inventory.h" 
#include "PrintSequence.h" 
#include <vector> 
#include <iterator> 
#include <algorithm> 
using namespace std; 

// To calculate inventory totals: 
class InvAccum { 
  int quantity; 
  int value; 
public: 
  InvAccum() : quantity(0), value(0) {} 
  void operator()(const Inventory& inv) { 
    quantity += inv.getQuantity(); 
    value += inv.getQuantity() * inv.getValue(); 
  } 
  friend ostream& 
  operator<<(ostream& os, const InvAccum& ia) { 
    return os << "total quantity: " 
      << ia.quantity 
      << ", total value: " << ia.value; 
  } 
}; 

int main() { 
  vector<Inventory> vi; 
  generate_n(back_inserter(vi), 15, InvenGen()); 
  print(vi, "vi"); 
  InvAccum ia = for_each(vi.begin(), vi.end(), 
    InvAccum()); 
  cout << ia << endl; 
} ///:~ 

InvAccum's operator( ) takes a single argument, as required by for_each( ). As for_each( ) 
moves through its range, it takes each object in that range and passes it to 
InvAccum::operator( ), which performs calculations and saves the result. At the end of this 
process, for_each( ) returns the InvAccum object which you can then examine; in this case it 
is simply printed. 

You can do most things to the Inventory objects using for_each( ). For example, if you 
wanted to increase all the prices by 10%, for_each( ) could do this handily. But you'll notice 
that the Inventory objects have no way to change the item value. The programmers who 
designed Inventory thought this was a good idea; after all, why would you want to change the 
name of an item? But marketing has decided that they want a "new, improved" look by 
changing all the item names to uppercase; they've done studies and determined that the new 
names will boost sales (well, marketing has to have something to do...). So for_each( ) will 
not work here, but transform( ) will: 

//: C05:TransformNames.cpp 
// More use of transform() 
#include "Inventory.h" 
#include "PrintSequence.h" 
#include <vector> 
#include <iterator> 
#include <algorithm> 
#include <cctype> 
using namespace std; 

struct NewImproved { 
  Inventory operator()(const Inventory& inv) { 
    return Inventory(toupper(inv.getItem()), 
      inv.getQuantity(), inv.getValue()); 
  } 
}; 

int main() { 
  vector<Inventory> vi; 
  generate_n(back_inserter(vi), 15, InvenGen()); 
  print(vi, "vi"); 
  transform(vi.begin(), vi.end(), vi.begin(), 
    NewImproved()); 
  print(vi, "vi"); 
} ///:~ 



Notice that the resulting range is the same as the input range, that is, the transformation is 
performed in-place. 

Now suppose that the sales department needs to generate special price lists with different 
discounts for each item. The original list must stay the same, and there need to be any number 
of generated special lists. Sales will give you a separate list of discounts for each new list. To 
solve this problem we can use the second version of transform( ): 

//: C05:SpecialList.cpp 
// Using the second version of transform() 
#include "Inventory.h" 
#include "PrintSequence.h" 
#include <vector> 
#include <iterator> 
#include <algorithm> 
#include <cstdlib> 
#include <ctime> 
using namespace std; 

struct Discounter { 
  Inventory operator()(const Inventory& inv, 
    float discount) { 
    return Inventory(inv.getItem(), 
      inv.getQuantity(), 
      int(inv.getValue() * (1 - discount))); 
  } 
}; 

struct DiscGen { 
  DiscGen() { srand(time(0)); } 
  float operator()() { 
    float r = float(rand() % 10); 
    return r / 100.0; 
  } 
}; 

int main() { 
  vector<Inventory> vi; 
  generate_n(back_inserter(vi), 15, InvenGen()); 
  print(vi, "vi"); 
  vector<float> disc; 
  generate_n(back_inserter(disc), 15, DiscGen()); 
  print(disc, "Discounts:"); 
  vector<Inventory> discounted; 
  transform(vi.begin(), vi.end(), disc.begin(), 
    back_inserter(discounted), Discounter()); 
  print(discounted, "discounted"); 
} ///:~ 

Discounter is a function object that, given an Inventory object and a discount percentage, 
produces a new Inventory with the discounted price. DiscGen just generates random discount 
values between 0 and 9 percent to use for testing. In main( ), two vectors are created, one 
for Inventory and one for discounts. These are passed to transform( ) along with a 
Discounter object, and transform( ) fills a new vector<Inventory> called discounted. 

Numeric algorithms 

These algorithms are all tucked into the header <numeric>, since they are primarily useful for 
performing numerical calculations. 



T accumulate(InputIterator first, InputIterator last, T result); 
T accumulate(InputIterator first, InputIterator last, T result, 
  BinaryFunction f); 

The first form is a generalized summation; for each element pointed to by an iterator i in 
[first, last), it performs the operation result = result + *i, where result is of type T. 
However, the second form is more general; it applies the function f(result, *i) on each 
element *i in the range from beginning to end. In both cases the running value starts out as 
the result argument, and if the range is empty that argument is returned unchanged. 

Note the similarity between the second form of transform( ) and the second form of 
accumulate( ). 
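
The following sketch exercises both forms on a small array, plus the empty-range case. The function name accumulateDemo is made up for the illustration.

```cpp
#include <cassert>
#include <functional>
#include <numeric>

inline void accumulateDemo() {
  int a[] = {1, 2, 3, 4};
  // First form: result = result + *i, starting from 0.
  int sum = std::accumulate(a, a + 4, 0);
  // Second form: result = f(result, *i). With multiplication the
  // starting value has to be 1, or every product would be 0.
  int product = std::accumulate(a, a + 4, 1, std::multiplies<int>());
  // Empty range: the initial value comes straight back.
  int untouched = std::accumulate(a, a, 42);
  assert(sum == 10);
  assert(product == 24);
  assert(untouched == 42);
}
```

One thing to watch for is that the type of the initial value determines the type of the running total, so passing 0 rather than 0.0 for a range of doubles truncates each partial result to int.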



T inner_product(InputIterator1 first1, InputIterator1 last1, 
  InputIterator2 first2, T init); 
T inner_product(InputIterator1 first1, InputIterator1 last1, 
  InputIterator2 first2, T init, 
  BinaryFunction1 op1, BinaryFunction2 op2); 

Calculates a generalized inner product of the two ranges [first1, last1) and [first2, first2 + 
(last1 - first1)). The return value is produced by multiplying the element from the first 
sequence by the "parallel" element in the second sequence, and then adding it to the sum. So 
if you have two sequences {1, 1, 2, 2} and {1, 2, 3, 4} the inner product becomes: 

(1*1) + (1*2) + (2*3) + (2*4) 

Which is 17. The init argument is the initial value for the inner product; this is probably zero 
but may be anything and is especially important for an empty first sequence, because then it 
becomes the default return value. The second sequence must have at least as many elements 
as the first. 

While the first form is very specifically mathematical, the second form is simply a multiple 
application of functions and could conceivably be used in many other situations. The op1 
function is used in place of addition, and op2 is used instead of multiplication. Thus, if you 
applied the second version of inner_product( ) to the above sequence, the result would be the 
following operations: 

init = op1(init, op2(1,1)); 
init = op1(init, op2(1,2)); 
init = op1(init, op2(2,3)); 
init = op1(init, op2(2,4)); 

Thus it's similar to transform( ) but two operations are performed instead of one. 
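
Using the same two sequences as in the text, this sketch checks the value 17 with both forms and then swaps in a different pair of operations to show a use that is not really numeric. The choice of equal_to as op2 is just one illustrative possibility, and the demo function name is made up.

```cpp
#include <cassert>
#include <functional>
#include <numeric>

inline void innerProductDemo() {
  int a[] = {1, 1, 2, 2};
  int b[] = {1, 2, 3, 4};
  // First form: (1*1) + (1*2) + (2*3) + (2*4)
  int dot = std::inner_product(a, a + 4, b, 0);
  assert(dot == 17);
  // Second form with the default operations spelled out.
  int dot2 = std::inner_product(a, a + 4, b, 0,
    std::plus<int>(), std::multiplies<int>());
  assert(dot2 == 17);
  // A different pair of operations: op2 compares parallel elements
  // (true converts to 1, false to 0) and op1 adds up the matches.
  int matches = std::inner_product(a, a + 4, b, 0,
    std::plus<int>(), std::equal_to<int>());
  assert(matches == 1);  // only the first pair (1, 1) is equal
}
```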

OutputIterator partial_sum(InputIterator first, InputIterator last, 
  OutputIterator result); 
OutputIterator partial_sum(InputIterator first, InputIterator last, 
  OutputIterator result, BinaryFunction op); 

Calculates a generalized partial sum. This means that a new sequence is created, beginning at 
result, where each element is the sum of all the elements up to the currently selected element 
in [first, last). For example, if the original sequence is {1, 1, 2, 2, 3} then the generated 
sequence is {1, 1 + 1, 1 + 1 + 2, 1 + 1 + 2 + 2, 1 + 1 + 2 + 2 + 3}, that is, {1, 2, 4, 6, 
9}. 

In the second version, the binary function op is used instead of the + operator to take all the 
"summation" up to that point and combine it with the new value. For example, if you use 
multiplies<int>( ) as the object for the above sequence, the output is {1, 1, 2, 4, 12}. Note 
that the first output value is always the same as the first input value. 

The return value is the end of the output range [result, result + (last - first)). 

<numeric> 

OutputIterator adjacent_difference(InputIterator first, InputIterator last, 
  OutputIterator result); 
OutputIterator adjacent_difference(InputIterator first, InputIterator last, 
  OutputIterator result, BinaryFunction op); 

Calculates the differences of adjacent elements throughout the range [first, last). This means 
that in the new sequence, the value is the value of the difference of the current element and 
the previous element in the original sequence (the first value is the same). For example, if the 
original sequence is {1, 1, 2, 2, 3}, the resulting sequence is {1, 1 - 1, 2 - 1, 2 - 2, 3 - 2}, that 
is: {1, 0, 1, 0, 1}. 

The second form uses the binary function op instead of the - operator to perform the 
"differencing." For example, if you use multiplies<int>( ) as the function object for the above 
sequence, the output is {1, 1, 2, 4, 6}. 

The return value is the end of the output range [result, result + (last - first)). 
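
The sketch below reproduces the example sequences for partial_sum( ) and adjacent_difference( ), and also shows that, in their default forms, one undoes the other. The function name runningTotalsDemo is made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

inline void runningTotalsDemo() {
  int a[] = {1, 1, 2, 2, 3};
  std::vector<int> sums(5), diffs(5), restored(5);
  std::partial_sum(a, a + 5, sums.begin());           // 1 2 4 6 9
  std::adjacent_difference(a, a + 5, diffs.begin());  // 1 0 1 0 1
  assert(sums[0] == 1 && sums[2] == 4 && sums[4] == 9);
  assert(diffs[0] == 1 && diffs[1] == 0 && diffs[4] == 1);
  // Differencing the running sums gives back the original sequence.
  std::adjacent_difference(sums.begin(), sums.end(), restored.begin());
  assert(std::equal(restored.begin(), restored.end(), a));
  // With multiplies, partial_sum produces running products: 1 1 2 4 12
  std::vector<int>::iterator end =
    std::partial_sum(a, a + 5, sums.begin(), std::multiplies<int>());
  assert(sums[3] == 4 && sums[4] == 12);
  assert(end == sums.end());  // return value is the end of the output
}
```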

Example 

This program tests all the algorithms in <numeric> in both forms, on integer arrays. You'll 
notice that in the test of the form where you supply the function or functions, the function 
objects used are the ones that produce the same result as form one so the results produced will 
be exactly the same. This should also demonstrate a bit more clearly the operations that are 
going on, and how to substitute your own operations. 

//: C05:NumericTest.cpp 
#include "PrintSequence.h" 
#include <numeric> 
#include <algorithm> 
#include <iostream> 
#include <iterator> 
#include <functional> 
using namespace std; 

int main() { 
  int a[] = { 1, 1, 2, 2, 3, 5, 7, 9, 11, 13 }; 
  const int asz = sizeof a / sizeof a[0]; 
  print(a, a + asz, "a", " "); 
  int r = accumulate(a, a + asz, 0); 
  cout << "accumulate 1: " << r << endl; 
  // Should produce the same result: 
  r = accumulate(a, a + asz, 0, plus<int>()); 
  cout << "accumulate 2: " << r << endl; 
  int b[] = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2 }; 
  print(b, b + sizeof b / sizeof b[0], "b", " "); 
  r = inner_product(a, a + asz, b, 0); 
  cout << "inner_product 1: " << r << endl; 
  // Should produce the same result: 
  r = inner_product(a, a + asz, b, 0, 
    plus<int>(), multiplies<int>()); 
  cout << "inner_product 2: " << r << endl; 
  int* it = partial_sum(a, a + asz, b); 
  print(b, it, "partial_sum 1", " "); 
  // Should produce the same result: 
  it = partial_sum(a, a + asz, b, plus<int>()); 
  print(b, it, "partial_sum 2", " "); 
  it = adjacent_difference(a, a + asz, b); 
  print(b, it, "adjacent_difference 1", " "); 
  // Should produce the same result: 
  it = adjacent_difference(a, a + asz, b, 
    minus<int>()); 
  print(b, it, "adjacent_difference 2", " "); 
} ///:~ 

Note that the return value of partial_sum( ) and adjacent_difference( ) is the past-the-end 
iterator for the resulting sequence, so it is used as the second iterator in the print( ) function. 

Since the second form of each function allows you to provide your own function object, only 
the first form of the functions is purely "numeric." You could conceivably do some things that 
are not intuitively numeric with something like inner_product( ). 



General utilities 



<utility> 
struct pair; 
make_pair( ); 

This was described and used in the previous chapter and in this one. A pair is simply a way to 
package two objects (which may be of different types) together into a single object. This is 
typically used when you need to return more than one object from a function, but it can also 
be used to create a container that holds pair objects, or to pass more than one object as a 
single argument. You access the elements by saying p.first and p.second, where p is the pair 
object. The function equal_range( ), described in the last chapter and in this one, returns its 
result as a pair of iterators. You can insert( ) a pair directly into a map or multimap; a pair 
is the value_type for those containers. 

If you want to create a pair, you typically use the template function make_pair( ) rather than 
explicitly constructing a pair object. 
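
Two of the uses mentioned above, returning two values from a function and inserting into a map, can be sketched as follows. The function divMod and the map contents are hypothetical examples.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical function that returns two results at once:
// the quotient in first and the remainder in second.
std::pair<int, int> divMod(int a, int b) {
  return std::make_pair(a / b, a % b);
}

inline void pairDemo() {
  std::pair<int, int> r = divMod(17, 5);
  assert(r.first == 3 && r.second == 2);
  // make_pair deduces the two types, so there is no need to spell
  // out std::pair<std::string, int> at the call site.
  std::map<std::string, int> ages;
  ages.insert(std::make_pair(std::string("Ann"), 31));
  assert(ages.size() == 1 && ages["Ann"] == 31);
}
```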



<iterator> 

distance(InputIterator first, InputIterator last); 

Tells you the number of elements between first and last. More precisely, it returns an integral 
value that tells you the number of times first must be incremented before it is equal to last. 
No dereferencing of the iterators occurs during this process. 

<iterator> 

void advance(InputIterator& i, Distance n); 

Moves the iterator i forward by the value of n (the iterator can also be moved backward for 
negative values of n if the iterator is also a bidirectional iterator). This algorithm is aware of 
bidirectional iterators, and will use the most efficient approach. 
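
These two functions matter most for containers whose iterators have no operator+ or operator-, such as list. Here is a minimal sketch (the function name is made up for the example):

```cpp
#include <cassert>
#include <iterator>
#include <list>

inline void iteratorStepDemo() {
  int a[] = {10, 20, 30, 40, 50};
  std::list<int> l(a, a + 5);
  std::list<int>::iterator it = l.begin();
  std::advance(it, 3);   // list iterators cannot say "it + 3"
  assert(*it == 40);
  assert(std::distance(l.begin(), it) == 3);
  std::advance(it, -2);  // allowed because list iterators are bidirectional
  assert(*it == 20);
  assert(std::distance(l.begin(), l.end()) == 5);
}
```

Note that advance( ) modifies its first argument in place and returns nothing, which is why it takes the iterator by reference.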

<iterator> 

back_insert_iterator<Container> back_inserter(Container& x); 
front_insert_iterator<Container> front_inserter(Container& x); 
insert_iterator<Container> inserter(Container& x, Iterator i); 

These functions are used to create iterators for the given containers that will insert elements 
into the container, rather than overwrite the existing elements in the container using 
operator= (which is the default behavior). Each type of iterator uses a different operation for 
insertion: back_insert_iterator uses push_back( ), front_insert_iterator uses 
push_front( ) and insert_iterator uses insert( ) (and thus it can be used with the associative 
containers, while the other two can be used with sequence containers). These were shown in 
some detail in the previous chapter, and also used in this chapter. 
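
The difference between the three inserters shows up in the order of the results, as this small sketch illustrates. The containers chosen here are arbitrary examples and the function name is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <set>
#include <vector>

inline void inserterDemo() {
  int a[] = {1, 2, 3};
  std::vector<int> v;
  std::deque<int> d;
  std::set<int> s;
  std::copy(a, a + 3, std::back_inserter(v));        // push_back: 1 2 3
  std::copy(a, a + 3, std::front_inserter(d));       // push_front: 3 2 1
  std::copy(a, a + 3, std::inserter(s, s.begin()));  // insert: fine for a set
  assert(v.size() == 3 && v[0] == 1 && v[2] == 3);
  assert(d.size() == 3 && d[0] == 3 && d[2] == 1);
  assert(s.size() == 3 && *s.begin() == 1);
}
```

Because front_inserter( ) calls push_front( ) for every element, the copied sequence ends up reversed, and it cannot be used with vector, which has no push_front( ).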

const LessThanComparable& min(const LessThanComparable& a, 
  const LessThanComparable& b); 
const T& min(const T& a, const T& b, BinaryPredicate binary_pred); 

Returns the lesser of its two arguments, or the first argument if the two are equivalent. The 
first version performs comparisons using operator< and the second passes both arguments to 
binary_pred to perform the comparison. 

const LessThanComparable& max(const LessThanComparable& a, 
  const LessThanComparable& b); 
const T& max(const T& a, const T& b, BinaryPredicate binary_pred); 

Exactly like min( ), but returns the greater of its two arguments. 
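
The following sketch shows both versions of min( ) and max( ), including what happens when the two arguments are equivalent under the predicate. The predicate shorter is a hypothetical example that compares strings by length only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical predicate: orders strings by length, ignoring content.
bool shorter(const std::string& a, const std::string& b) {
  return a.size() < b.size();
}

inline void minMaxDemo() {
  std::string x("pear"), y("fig"), z("plum");
  assert(std::min(x, y) == "fig");           // operator<: "fig" < "pear"
  assert(std::max(x, y) == "pear");
  assert(std::min(x, y, shorter) == "fig");  // by length: 3 < 4
  // "pear" and "plum" have the same length, so they are equivalent
  // under shorter and the first argument is the one returned.
  assert(&std::min(x, z, shorter) == &x);
  assert(&std::max(x, z, shorter) == &x);
}
```

Both functions return a reference to one of their arguments rather than a copy, which is what makes the address comparison in the last two lines possible.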



void swap(Assignable& a, Assignable& b); 
void iter_swap(ForwardIterator1 a, ForwardIterator2 b); 

Exchanges the values of a and b using assignment. Note that all container classes use 
specialized versions of swap( ) that are typically more efficient than this general version. 

iter_swap( ) exchanges the values that the two iterators point to. It is a backwards-compatible 
remnant in the standard; you can just use swap( ) on the dereferenced iterators. 

Creating your own STL-style 
algorithms 

Once you become comfortable with the STL algorithm style, you can begin to create your 
own STL-style algorithms. Because these will conform to the format of all the other 
algorithms in the STL, they're easy to use for programmers who are familiar with the STL, 
and thus become a way to "extend the STL vocabulary." 

The easiest way to approach the problem is to go to the <algorithm> header file and find 
something similar to what you need, and modify that (virtually all STL implementations 
provide the code for the templates directly in the header files). For example, an algorithm that 
stands out by its absence is copy_if( ) (the closest approximation is partition( )), which was 
used in Binder1.cpp at the beginning of this chapter, and in several other examples in this 
chapter. This will only copy an element if it satisfies a predicate. Here's an implementation: 

//: C05:copy_if.h 
// Roll your own STL-style algorithm 
#ifndef COPY_IF_H 
#define COPY_IF_H 

template<typename ForwardIter, 
  typename OutputIter, typename UnaryPred> 
OutputIter copy_if(ForwardIter begin, ForwardIter end, 
  OutputIter dest, UnaryPred f) { 
  while(begin != end) { 
    if(f(*begin)) 
      *dest++ = *begin; 
    begin++; 
  } 
  return dest; 
} 
#endif // COPY_IF_H ///:~ 

The return value is the past-the-end iterator for the destination sequence (the copied 
sequence). 
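
To show such an algorithm in use, here is a sketch of the same idea under the hypothetical name copyIf, together with a made-up predicate isEven. The different name is deliberate: the C++11 standard library added its own std::copy_if, and an unqualified call to a global copy_if( ) with standard container iterators can then become ambiguous.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Same shape as a hand-written copy_if, under a hypothetical name.
template<typename InputIter, typename OutputIter, typename UnaryPred>
OutputIter copyIf(InputIter begin, InputIter end,
                  OutputIter dest, UnaryPred f) {
  for(; begin != end; ++begin)
    if(f(*begin))
      *dest++ = *begin;  // copy only the elements that satisfy f
  return dest;           // past-the-end of the copied sequence
}

bool isEven(int x) { return x % 2 == 0; }  // hypothetical predicate

inline void copyIfDemo() {
  int a[] = {1, 2, 3, 4, 5, 6};
  std::vector<int> evens;
  copyIf(a, a + 6, std::back_inserter(evens), isEven);
  assert(evens.size() == 3 && evens[0] == 2 && evens[2] == 6);
  int out[6];
  int* end = copyIf(a, a + 6, out, isEven);
  assert(end - out == 3);  // the return value marks the end of the copies
}
```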

Now that you're comfortable with the ideas of the various iterator types, the actual 
implementation is quite straightforward. You can imagine creating an entire additional library 
of your own useful algorithms that follow the format of the STL. 

Summary 



The goal of this chapter, and the previous one, was to give you a programmer's-depth 
understanding of the containers and algorithms in the Standard Template Library. That is, to 
make you aware of and comfortable enough with the STL that you begin to use it on a regular 
basis (or at least, to think of using it so you can come back here and hunt for the appropriate 
solution). It is powerful not only because it's a reasonably complete library of tools, but also 
because it provides a vocabulary for thinking about problem solutions, and because it is a 
framework for creating additional tools. 

Although this chapter and the last did show some examples of creating your own tools, I did 
not go into the full depth of the theory of the STL that is necessary to completely understand 
all the STL nooks and crannies to allow you to create tools more sophisticated than those 
shown here. I did not do this partially because of space limitations, but mostly because it is 
beyond the charter of this book; my goal here is to give you practical understanding that will 
affect your day-to-day programming skills. 

There are a number of books dedicated solely to the STL (these are listed in the appendices), 
but the two that I learned the most from, in terms of the theory necessary for tool creation, 
were first, Generic Programming and the STL by Matthew H. Austern, Addison-Wesley 1999 
(this also covers all the SGI extensions, which Austern was instrumental in creating), and 
second (older and somewhat out of date, but still quite valuable), C++ Programmer's Guide 
to the Standard Template Library by Mark Nelson, IDG press 1995. 



Exercises 



Create a generator that returns the current value of clock()(in <ctime>). 
Create a list<clock_t> and fill it with your generator using generate_n( ). 
Remove any duplicates in the list and print it to cout using copyC ). 
Modify Stlshape.cpp from chapter XXX so that it uses transfomi() to 
delete all its objects. 

Using transfomi( ) and toupper( ) (in <cctype>) write a single function 
call that will convert a string to all uppercase letters. 

Create a Sum fti net ion object template that will accumulate all the values in 
a range when used with for_each( ). 

Write an anagram generator that takes a word as a command-line argument 
and produces all possible permutations of the letters. 
Write a "sentence anagram generator" that takes a sentence as a command- 
line argument and produces all possible permutations of the words in the 
sentence (it leaves the words alone, just moves them around). 
Create a class hierarchy with a base class B and a derived class D. Put a 
virtual member function void f() in B such that it will print a message 
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indicating tliat B's f( ) lias been called, and redefine this function for D to 

print a different message. Create a deque<B*> and fill it with B and D 

objects. Use for_each( ) to call f( ) for each of the objects in your deque. 

Modify FunctionObjects.cpp so that it uses float instead of int. 

Modify FunctionObjects.cpp so that it templatizes the main body of tests 

so you can choose which type you're going to test (you'll have to pull most 

of ■iiain( ) out into a separate template function). 

Using transfomi( ), toupper( ) and tolower( ) (in <ccytpe>), create two 

functions such that the first takes a string object and returns that string with 

all the letters in uppercase, and the second returns a string with all the 

letters in lowercase. 

Create a container of containers of Noisy objects, and sort them. Now write 

a template for your sorting test (to use with the three basic sequence 

containers), and compare the performance of the different container types. 

Write a program that takes as a command line argument the name of a text 

file. Open this file and read it a word at a time (hint: use »). Store each 

word into a deque<string>. Force all the words to lowercase, sort them, 

remove all the duplicates and print the results. 

Write a program that finds all the words that are in common between two 

input files, using set_intersection( ). Change it to show the words that are 

not in common, using set_syninietric_difference( ). 

Create a program that, given an integer on the command line, creates a 

"factorial table" of all the factorials up to and including the number on the 

command line. To do this, write a generator to fill a vector<int>, then use 

partial_suni( ) with a standard function object. 

Modify Cak Inventory .cpp so that it will find all the objects that have a 

quantity that's less than a certain amount. Provide this amount as a 

command-line argument, and use copy_if( ) and bind2nd( ) to create the 

collection of values less than the target value. 

Create template function objects that perform bitwise operations for &, I, '' 

and -. Test these with a bitset. 

Fill a vector<double> with numbers representing angles in radians. Using 

function object composition, take the sine of all the elements in your vector 

(see <cmath>). 

Create a map which is a cosine table where the keys are the angles in 

degrees and the values are the cosines. Use transfomi( ) with cos( ) (in 

<cmath>) to fill the table. 

W rhe a program to compare the speed of sorting a list using list::sori( ) vs. 

using sld!:sort() (the STL algorithm version ofsort()). Hint: seethe 

timing examples in the previous chapter. 

Create and test a logical_xor function object template to implement a 

logical exciusive-o/-. 
Create an STL-style algorithm transform_if( ) following the first form of
transform( ) which only performs transformations on objects that satisfy a
unary predicate.

Create an STL-style algorithm which is an overloaded version of
for_each( ) that follows the second form of transform( ) and takes two
input ranges so it can pass the objects of the second input range to a
binary function which it applies to each object of the first range.

Create a Matrix class which is made from a vector<vector<int> >. Provide
it with a friend ostream& operator<<(ostream&, const Matrix&) to
display the matrix. Create the following using the STL algorithms where
possible (you may need to look up the mathematical meanings of the matrix
operations if you don't remember them): operator+(const Matrix&, const
Matrix&) for Matrix addition, operator*(const Matrix&, const
vector<int>&) for multiplying a matrix by a vector, and operator*(const
Matrix&, const Matrix&) for matrix multiplication. Demonstrate each.

Templatize the Matrix class and associated operations from the previous
example so they will work with any appropriate type.
Part 2: Advanced Topics



6: Multiple inheritance



The basic concept of multiple inheritance (MI) sounds
simple enough.



[[[
1. Demo of use of MI, using Greenhouse example and different company's greenhouse
controller equipment.

2. Introduce concept of interfaces; toys and "tuckable" interface
]]]

You create a new type by inheriting from more than one base class. The syntax is exactly 
what you'd expect, and as long as the inheritance diagrams are simple, MI is simple as well. 

However, MI can introduce a number of ambiguities and strange situations, which are covered 
in this chapter. But first, it helps to get a perspective on the subject. 



Perspective 



Before C++, the most successful object-oriented language was Smalltalk. Smalltalk was
created from the ground up as an object-oriented language. It is often referred to as pure,
whereas C++, because it was built on top of C, is called hybrid. One of the design decisions
made with Smalltalk was that all classes would be derived in a single hierarchy, rooted in a
single base class (called Object - this is the model for the object-based hierarchy). You cannot
create a new class in Smalltalk without inheriting it from an existing class, which is why it
takes a certain amount of time to become productive in Smalltalk - you must learn the class
library before you can start making new classes. So the Smalltalk class hierarchy is always a
single monolithic tree.

Classes in Smalltalk usually have a number of things in common, and always have some
things in common (the characteristics and behaviors of Object), so you almost never run into
a situation where you need to inherit from more than one base class. However, with C++ you
can create as many hierarchy trees as you want. Therefore, for logical completeness the
language must be able to combine more than one class at a time - thus the need for multiple
inheritance.

However, this was not a crystal-clear case of a feature that no one could live without, and 
there was (and still is) a lot of disagreement about whether MI is really essential in C++. MI
was added in AT&T cfront release 2.0 and was the first significant change to the language. 
Since then, a number of other features have been added (notably templates) that change the 
way we think about programming and place MI in a much less important role. You can think 
of MI as a "minor" language feature that shouldn't be involved in your daily design decisions. 

One of the most pressing issues that drove MI involved containers. Suppose you want to 
create a container that everyone can easily use. One approach is to use void* as the type 
inside the container, as with PStash and Stack. The Smalltalk approach, however, is to make 
a container that holds Objects. (Remember that Object is the base type of the entire Smalltalk 
hierarchy.) Because everything in Smalltalk is ultimately derived from Object, any container 
that holds Objects can hold anything, so this approach works nicely. 

Now consider the situation in C++. Suppose vendor A creates an object-based hierarchy that 
includes a useful set of containers including one you want to use called Holder. Now you 
come across vendor B's class hierarchy that contains some other class that is important to 
you, a BitImage class, for example, which holds graphic images. The only way to make a
Holder of BitImages is to inherit a new class from both Object, so it can be held in the
Holder, and BitImage:




This was seen as an important reason for MI, and a number of class libraries were built on this
model. However, as you saw in Chapter XX, the addition of templates has changed the way
containers are created, so this situation isn't a driving issue for MI.
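
As a rough sketch of the situation just described, the MI-based workaround might look like the following. The classes Object, Holder, BitImage and HeldBitImage are hypothetical stand-ins for the two vendors' code and your own combining class; none of them is a real library type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vendor A (hypothetical): an object-based hierarchy and its container.
class Object {
public:
  virtual ~Object() {}
};

class Holder {
  std::vector<Object*> items; // Does not own the objects
public:
  void add(Object* o) { items.push_back(o); }
  Object* get(std::size_t i) const { return items[i]; }
  std::size_t size() const { return items.size(); }
};

// Vendor B (hypothetical): an unrelated class you want to store.
class BitImage {
  int w, h;
public:
  BitImage(int width, int height) : w(width), h(height) {}
  int width() const { return w; }
  int height() const { return h; }
};

// Your class: an Object (so Holder accepts it) and a BitImage.
class HeldBitImage : public Object, public BitImage {
public:
  HeldBitImage(int width, int height) : BitImage(width, height) {}
};

// Stores one image in a Holder and gets it back as a BitImage.
int storedWidth() {
  Holder h;
  HeldBitImage img(640, 480);
  h.add(&img); // Upcast to Object*
  // Getting the image back requires a downcast to the real type:
  HeldBitImage* p = dynamic_cast<HeldBitImage*>(h.get(0));
  assert(p != 0);
  return p->width();
}
```

Calling storedWidth( ) from main( ) exercises the round trip. With templates, a vector<BitImage> holds the images directly, and neither the common base class nor the downcast is needed.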

The other reason you may need MI is logical, related to design. Unlike the above situation,
where you don't have control of the base classes, in this one you do, and you intentionally use
MI to make the design more flexible or useful. (At least, you may believe this to be the case.)
An example of this is in the original iostream library design:

Both istream and ostream are useful classes by themselves, but they can also be inherited
into a class that combines both their characteristics and behaviors.
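
The same arrangement survives in the Standard C++ library, where std::stringstream is derived from std::iostream, which in turn inherits from both std::istream and std::ostream. The short sketch below (the function names put, get and echoThroughStream are made up for illustration) passes one object to a function expecting an ostream and to another expecting an istream:

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Writes through the ostream interface.
void put(std::ostream& os, int value) { os << value << ' '; }

// Reads through the istream interface.
int get(std::istream& is) {
  int value = 0;
  is >> value;
  assert(!is.fail()); // The extraction is expected to succeed
  return value;
}

// One stringstream object serves as both an ostream and an istream.
int echoThroughStream(int value) {
  std::stringstream s; // Derived from std::iostream
  put(s, value);       // Upcast to ostream&
  return get(s);       // Upcast to istream&
}
```

Both upcasts are implicit; the caller never has to say which of the two base classes is wanted.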



Regardless of what motivates you to use MI, a number of problems arise in the process, and
you need to understand them to use it.



Duplicate subobjects 



When you inherit from a base class, you get a copy of all the data members of that base class
in your derived class. This copy is referred to as a subobject. If you multiply inherit from class
d1 and class d2 into class mi, class mi contains one subobject of d1 and one of d2. So your
mi object looks like this:



Now consider what happens if d1 and d2 both inherit from the same base class, called Base:

In the above diagram, both d1 and d2 contain a subobject of Base, so mi contains two
subobjects of Base. Because of the path produced in the diagram, this is sometimes called a
"diamond" in the inheritance hierarchy. Without diamonds, multiple inheritance is quite
straightforward, but as soon as a diamond appears, trouble starts because you have duplicate
subobjects in your new class. This takes up extra space, which may or may not be a problem
depending on your design. But it also introduces an ambiguity.



Ambiguous upcasting 



What happens, in the above diagram, if you want to cast a pointer to an mi to a pointer to a
Base? There are two subobjects of type Base, so which address does the cast produce? Here's
the diagram in code:










//: C06:MultipleInheritance1.cpp
// MI & ambiguity
#include "../purge.h"
#include <iostream>
#include <vector>
using namespace std;

class MBase {
public:
  virtual const char* vf() const = 0;
  virtual ~MBase() {}
};

class D1 : public MBase {
public:
  const char* vf() const { return "D1"; }
};

class D2 : public MBase {
public:
  const char* vf() const { return "D2"; }
};

// Causes error: ambiguous override of vf():
//! class MI : public D1, public D2 {};

int main() {
  vector<MBase*> b;
  b.push_back(new D1);
  b.push_back(new D2);
  // Cannot upcast: which subobject?:
//!  b.push_back(new MI);
  for(int i = 0; i < b.size(); i++)
    cout << b[i]->vf() << endl;
  purge(b);
} ///:~

Two problems occur here. First, you cannot even create the class MI because doing so would
cause a clash between the two definitions of vf( ) in D1 and D2.

Second, the commented-out push_back( ) in main( ) attempts to create a new MI and upcast
the address to an MBase*. The compiler won't accept this because it has no way of knowing
whether you want to use D1's subobject MBase or D2's subobject MBase for the resulting
address.
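
If you are stuck with a nonvirtual diamond, one way around the ambiguous upcast is to name the path yourself by casting to one of the intermediate classes first. The following is only a sketch under that assumption: it reuses the class names from the listing, returns std::string instead of char* for convenience, and has MI override vf( ) itself so the clash between the two inherited versions does not arise. The helper twoBaseSubobjects( ) is invented for the illustration.

```cpp
#include <cassert>
#include <string>

class MBase {
public:
  virtual std::string vf() const = 0;
  virtual ~MBase() {}
};

class D1 : public MBase {
public:
  std::string vf() const { return "D1"; }
};

class D2 : public MBase {
public:
  std::string vf() const { return "D2"; }
};

// Nonvirtual diamond: MI holds two MBase subobjects.
// MI supplies its own vf(), so there is no clash to resolve.
class MI : public D1, public D2 {
public:
  std::string vf() const { return "MI"; }
};

// Returns true if the two upcast paths give different addresses.
bool twoBaseSubobjects() {
  MI mi;
  // MBase* p = &mi;  // Error: ambiguous conversion
  MBase* viaD1 = static_cast<D1*>(&mi); // Pick D1's MBase
  MBase* viaD2 = static_cast<D2*>(&mi); // Pick D2's MBase
  // Either path reaches the same final overrider:
  assert(viaD1->vf() == "MI" && viaD2->vf() == "MI");
  return viaD1 != viaD2;
}
```

The two pointers refer to different MBase subobjects of the same MI object, which is exactly why the compiler refuses to choose one for you.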



Virtual base classes



The solution to the second problem is a language extension: The meaning of the virtual
keyword is overloaded. If you inherit a base class as virtual, only one subobject of that class
will ever appear as a base class. Virtual base classes are implemented by the compiler with
pointer magic in a way suggesting the implementation of ordinary virtual functions.

Because only one subobject of a virtual base class will ever appear during multiple
inheritance, there is no ambiguity during upcasting. Here's an example:



//: C06:MultipleInheritance2.cpp
// Virtual base classes
#include "../purge.h"
#include <iostream>
#include <vector>
using namespace std;

class MBase {
public:
  virtual const char* vf() const = 0;
  virtual ~MBase() {}
};

class D1 : virtual public MBase {
public:
  const char* vf() const { return "D1"; }
};

class D2 : virtual public MBase {
public:
  const char* vf() const { return "D2"; }
};

// MUST explicitly disambiguate vf():
class MI : public D1, public D2 {
public:
  const char* vf() const { return D1::vf(); }
};

int main() {
  vector<MBase*> b;
  b.push_back(new D1);
  b.push_back(new D2);
  b.push_back(new MI); // OK
  for(int i = 0; i < b.size(); i++)
    cout << b[i]->vf() << endl;
  purge(b);
} ///:~



The compiler now accepts the upcast, but notice that you must still explicitly disambiguate the 
function vf( ) in MI; otherwise the compiler wouldn't know which version to use. 
The "most derived" class and virtual
base initialization

The use of virtual base classes isn't quite as simple as that. The above example uses the
(compiler-synthesized) default constructor. If the virtual base has a constructor, things
become a bit strange. To understand this, you need a new term: most-derived class.

The most-derived class is the one you're currently in, and is particularly important when
you're thinking about constructors. In the previous example, MBase is the most-derived class
inside the MBase constructor. Inside the D1 constructor, D1 is the most-derived class, and
inside the MI constructor, MI is the most-derived class.



When you are using a virtual base class, the most-derived constructor is responsible for 
initializing that virtual base class. That means any class, no matter how far away it is from the 
virtual base, is responsible for initializing it. Here's an example: 



//: C06:MultipleInheritance3.cpp
// Virtual base initialization
// Virtual base classes must always be
// initialized by the "most-derived" class
#include "../purge.h"
#include <iostream>
#include <vector>
using namespace std;

class MBase {
public:
  MBase(int) {}
  virtual const char* vf() const = 0;
  virtual ~MBase() {}
};

class D1 : virtual public MBase {
public:
  D1() : MBase(1) {}
  const char* vf() const { return "D1"; }
};

class D2 : virtual public MBase {
public:
  D2() : MBase(2) {}
  const char* vf() const { return "D2"; }
};

class MI : public D1, public D2 {
public:
  MI() : MBase(3) {}
  const char* vf() const {
    return D1::vf(); // MUST disambiguate
  }
};

class X : public MI {
public:
  // You must ALWAYS initialize the virtual base:
  X() : MBase(4) {}
};

int main() {
  vector<MBase*> b;
  b.push_back(new D1);
  b.push_back(new D2);
  b.push_back(new MI); // OK
  b.push_back(new X);
  for(int i = 0; i < b.size(); i++)
    cout << b[i]->vf() << endl;
  purge(b);
} ///:~

As you would expect, both D1 and D2 must initialize MBase in their constructor. But so must
MI and X, even though they are more than one layer away! That's because each one in turn
becomes the most-derived class. The compiler can't know whether to use D1's initialization
of MBase or to use D2's version. Thus you are always forced to do it in the most-derived
class. Note that only the single selected virtual base constructor is called.
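
To see which initializer actually wins, you can have the virtual base remember the argument it was constructed with. The sketch below is a variation on the listing above rather than part of it: the id member, getId( ), and the idOf...( ) helper functions are additions made for the demonstration.

```cpp
#include <cassert>

class MBase {
  int id; // Records which initializer was used
public:
  MBase(int i) : id(i) {}
  int getId() const { return id; }
  virtual ~MBase() {}
};

class D1 : virtual public MBase {
public:
  D1() : MBase(1) {}
};

class D2 : virtual public MBase {
public:
  D2() : MBase(2) {}
};

class MI : public D1, public D2 {
public:
  MI() : MBase(3) {} // D1's and D2's initializers are ignored
};

class X : public MI {
public:
  X() : MBase(4) {} // MI's initializer is ignored too
};

// Each function reports the id the shared MBase was built with.
int idOfD1() { D1 d; return d.getId(); }
int idOfMI() { MI m; return m.getId(); }
int idOfX()  { X x;  return x.getId(); }

void demo() {
  assert(idOfD1() == 1); // D1 is most-derived here
  assert(idOfMI() == 3); // MI is most-derived here
  assert(idOfX() == 4);  // X is most-derived here
}
```

In every case the value comes from the class of the object actually being created; the initializers written in the intermediate classes are silently skipped.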

"Tying off" virtual bases with a default 
constructor 

Forcing the most-derived class to initialize a virtual base that may be buried deep in the class
hierarchy can seem like a tedious and confusing task to put upon the user of your class. It's
better to make this invisible, which is done by creating a default constructor for the virtual
base class, like this:

//: C06:MultipleInheritance4.cpp
// "Tying off" virtual bases
// so you don't have to worry about them
// in derived classes
#include "../purge.h"
#include <iostream>
#include <vector>
using namespace std;

class MBase {
public:
  // Default constructor removes responsibility:
  MBase(int = 0) {}
  virtual const char* vf() const = 0;
  virtual ~MBase() {}
};

class D1 : virtual public MBase {
public:
  D1() : MBase(1) {}
  const char* vf() const { return "D1"; }
};

class D2 : virtual public MBase {
public:
  D2() : MBase(2) {}
  const char* vf() const { return "D2"; }
};

class MI : public D1, public D2 {
public:
  MI() {} // Calls default constructor for MBase
  const char* vf() const {
    return D1::vf(); // MUST disambiguate
  }
};

class X : public MI {
public:
  X() {} // Calls default constructor for MBase
};

int main() {
  vector<MBase*> b;
  b.push_back(new D1);
  b.push_back(new D2);
  b.push_back(new MI); // OK
  b.push_back(new X);
  for(int i = 0; i < b.size(); i++)
    cout << b[i]->vf() << endl;
  purge(b);
} ///:~

If you can always arrange for a virtual base class to have a default constructor, you'll make
things much easier for anyone who inherits from that class.



Overhead 



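The term "pointer magic" has been used to describe the way virtual inheritance is
implemented. You can see the physical overhead of virtual inheritance with the following
program: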

//: C06:Overhead.cpp
// Virtual base class overhead
#include <fstream>
using namespace std;
// Output stream used by WRITE (file name assumed):
ofstream out("overhead.out");

class MBase {
public:
  virtual void f() const {};
  virtual ~MBase() {}
};

class NonVirtualInheritance
  : public MBase {};

class VirtualInheritance
  : virtual public MBase {};

class VirtualInheritance2
  : virtual public MBase {};

class MI
  : public VirtualInheritance,
    public VirtualInheritance2 {};

#define WRITE(ARG) \
out << #ARG << " = " << ARG << endl;

int main() {
  MBase b;
  WRITE(sizeof(b));
  NonVirtualInheritance nonv_inheritance;
  WRITE(sizeof(nonv_inheritance));
  VirtualInheritance v_inheritance;
  WRITE(sizeof(v_inheritance));
  MI mi;
  WRITE(sizeof(mi));
} ///:~

Each of these classes only contains a single byte, and the "core size" is that byte. Because all
these classes contain virtual functions, you expect the object size to be bigger than the core
size by a pointer (at least - your compiler may also pad extra bytes into an object for
alignment). The results are a bit surprising (these are from one particular compiler; yours may
do it differently):

sizeof(b) = 2
sizeof(nonv_inheritance) = 2
sizeof(v_inheritance) = 6
sizeof(mi) = 12

Both b and nonv_inheritance contain the extra pointer, as expected. But when virtual
inheritance is added, it would appear that the VPTR plus two extra pointers are added! By the
time the multiple inheritance is performed, the object appears to contain five extra pointers
(however, one of these is probably a second VPTR for the second multiply inherited
subobject).

The curious can certainly probe into your particular implementation and look at the assembly
language for member selection to determine exactly what these extra bytes are for, and the
cost of member selection with multiple inheritance. (See also Jan Gray, "C++ Under the
Hood", a chapter in Black Belt C++, edited by Bruce Eckel, M&T Press, 1995.) The rest of
you have probably seen enough to guess that quite a bit more goes on with virtual multiple
inheritance, so it should be used sparingly (or avoided) when efficiency is an issue.



Upcasting



When you embed subobjects of a class inside a new class, whether by creating member
objects or through inheritance, each subobject is placed within the new object by the
compiler. Each subobject has its own this pointer, and as long as you're dealing
with member objects, everything is quite straightforward. But as soon as multiple inheritance
is introduced, a funny thing occurs: An object can have more than one this pointer because
the object represents more than one type during upcasting. The following example
demonstrates this point:

//: C06:Mithis.cpp
// MI and the "this" pointer
#include <fstream>
using namespace std;
// Output stream for the results (file name assumed):
ofstream out("mithis.out");

class Base1 {
  char c[0x10];
public:
  void printthis1() {
    out << "Base1 this = " << this << endl;
  }
};

class Base2 {
  char c[0x10];
public:
  void printthis2() {
    out << "Base2 this = " << this << endl;
  }
};

class Member1 {
  char c[0x10];
public:
  void printthism1() {
    out << "Member1 this = " << this << endl;
  }
};

class Member2 {
  char c[0x10];
public:
  void printthism2() {
    out << "Member2 this = " << this << endl;
  }
};

class MI : public Base1, public Base2 {
  Member1 m1;
  Member2 m2;
public:
  void printthis() {
    out << "mi this = " << this << endl;
    printthis1();
    printthis2();
    m1.printthism1();
    m2.printthism2();
  }
};

int main() {
  MI mi;
  out << "sizeof(mi) = "
    << hex << sizeof(mi) << " hex" << endl;
  mi.printthis();
  // A second demonstration:
  Base1* b1 = &mi; // Upcast
  Base2* b2 = &mi; // Upcast
  out << "Base 1 pointer = " << b1 << endl;
  out << "Base 2 pointer = " << b2 << endl;
} ///:~

The arrays of bytes inside each class are created with hexadecimal sizes, so the output
addresses (which are printed in hex) are easy to read. Each class has a function that prints its
this pointer, and these classes are assembled with both multiple inheritance and composition
into the class MI, which prints its own address and the addresses of all the other subobjects.
This function is called in main( ). You can clearly see that you get two different this pointers
for the same object. The address of the MI object is taken and upcast to the two different
types. Here's the output (for easy readability the code was generated for a small-model Intel
processor):

sizeof(mi) = 40 hex
mi this = 0x223e
Base1 this = 0x223e
Base2 this = 0x224e
Member1 this = 0x225e
Member2 this = 0x226e
Base 1 pointer = 0x223e
Base 2 pointer = 0x224e

Although object layouts vary from compiler to compiler and are not specified in Standard
C++, this one is fairly typical. The starting address of the object corresponds to the address of
the first class in the base-class list. Then the second inherited class is placed, followed by the
member objects in order of declaration.

When the upcast to the Base1 and Base2 pointers occur, you can see that, even though they're
ostensibly pointing to the same object, they must actually have different this pointers, so the
proper starting address can be passed to the member functions of each subobject. The only
way things can work correctly is if this implicit upcasting takes place when you call a member
function for a multiply inherited subobject.
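
The pointer adjustment can also be checked without printing addresses. This is a minimal sketch using cut-down versions of Base1, Base2 and MI; the functions basesDiffer( ), roundTrips( ) and demo( ) are invented for the illustration.

```cpp
#include <cassert>

class Base1 { char c[0x10]; };
class Base2 { char c[0x10]; };
class MI : public Base1, public Base2 {};

// True if the Base1 and Base2 subobjects of one MI object
// live at different addresses.
bool basesDiffer() {
  MI mi;
  Base1* b1 = &mi;
  Base2* b2 = &mi; // Compiler adjusts the pointer
  return static_cast<void*>(b1) != static_cast<void*>(b2);
}

// True if a downcast from Base2* undoes the adjustment.
bool roundTrips() {
  MI mi;
  Base2* b2 = &mi;
  MI* back = static_cast<MI*>(b2); // Adjusted back to the MI address
  return back == &mi;
}

void demo() {
  assert(basesDiffer());
  assert(roundTrips());
}
```

Note that comparing b2 with &mi directly would report them equal, because the compiler converts &mi to a Base2* before comparing; the difference only shows up once both pointers are converted to void*.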



Persistence 



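One situation where multiple inheritance seems useful, but where the differing this pointers
cause trouble, is persistence.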

The lifetime of a local object is the scope in which it is defined. The lifetime of a global 
object is the lifetime of the program. A persistent object lives between invocations of a 
program: You can normally think of it as existing on disk instead of in memory. One 
definition of an object-oriented database is "a collection of persistent objects." 

To implement persistence, you must move a persistent object from disk into memory in order
to call functions for it, and later store it to disk before the program expires. Four issues arise
when storing an object on disk:

1. The object must be converted from its representation in memory to a series of
bytes on disk.

2. Because the values of any pointers in memory won't have meaning the next time the
program is invoked, these pointers must be converted to something meaningful.

3. What the pointers point to must also be stored and retrieved.

4. When restoring an object from disk, the virtual pointers in the object must be
respected.

Because the object must be converted back and forth between a layout in memory and a serial 
representation on disk, the process is called serialization (to write an object to disk) and 
deserialization (to restore an object from disk). Although it would be very convenient, these 
processes require too much overhead to support directly in the language. Class libraries will 
often build in support for serialization and deserialization by adding special member functions 
and placing requirements on new classes. (Usually some sort of serialize( ) function must be
written for each new class.) Also, persistence is generally not automatic; you must usually 
explicitly write and read the objects. 
MI-based persistence



Consider sidestepping the pointer issues for now and creating a class that installs persistence
into simple objects using multiple inheritance. By inheriting the persistence class along with
your new class, you automatically create classes that can be read from and written to disk.
Although this sounds great, the use of multiple inheritance introduces a pitfall, as seen in the
following example.

//: C06:Persist1.cpp
// Simple persistence with MI
#include "../require.h"
#include <iostream>
#include <fstream>
using namespace std;

class Persistent {
  int objSize; // Size of stored object
public:
  Persistent(int sz) : objSize(sz) {}
  void write(ostream& out) const {
    out.write((char*)this, objSize);
  }
  void read(istream& in) {
    in.read((char*)this, objSize);
  }
};

class Data {
  float f[3];
public:
  Data(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) {
    f[0] = f0;
    f[1] = f1;
    f[2] = f2;
  }
  void print(const char* msg = "") const {
    if(*msg) cout << msg << endl;
    for(int i = 0; i < 3; i++)
      cout << "f[" << i << "] = "
           << f[i] << endl;
  }
};

class WData1 : public Persistent, public Data {
public:
  WData1(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) : Data(f0, f1, f2),
    Persistent(sizeof(WData1)) {}
};

class WData2 : public Data, public Persistent {
public:
  WData2(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) : Data(f0, f1, f2),
    Persistent(sizeof(WData2)) {}
};

int main() {
  {
    ofstream f1("f1.dat"), f2("f2.dat");
    assure(f1, "f1.dat"); assure(f2, "f2.dat");
    WData1 d1(1.1, 2.2, 3.3);
    WData2 d2(4.4, 5.5, 6.6);
    d1.print("d1 before storage");
    d2.print("d2 before storage");
    d1.write(f1);
    d2.write(f2);
  } // Closes files
  ifstream f1("f1.dat"), f2("f2.dat");
  assure(f1, "f1.dat"); assure(f2, "f2.dat");
  WData1 d1;
  WData2 d2;
  d1.read(f1);
  d2.read(f2);
  d1.print("d1 after storage");
  d2.print("d2 after storage");
} ///:~










In this very simple version, the Persistent::read( ) and Persistent::write( ) functions take the
this pointer and call iostream read( ) and write( ) functions. (Note that any type of iostream
can be used). A more sophisticated Persistent class would call a virtual write( ) function for
each subobject.

With the language features covered so far in the book, the number of bytes in the object
cannot be known by the Persistent class so it is inserted as a constructor argument. (In
Chapter XX, run-time type identification shows how you can find the exact type of an object
given only a base pointer; once you have the exact type you can find out the correct size with
the sizeof operator.)



The Data class contains no pointers or VPTR, so there is no danger in simply writing it to
disk and reading it back again. And it works fine in class WData1 when, in main( ), it's
written to file f1.dat and later read back again. However, when Persistent is second in the
inheritance list, as in WData2, the this pointer for Persistent is offset to the end of the object,
so it reads and writes past the end of the object. This not only produces garbage when reading
the object from the file, it's dangerous because it walks over any storage that occurs after the
object.

This problem occurs in multiple inheritance any time a class must produce the this pointer for
the actual object from a subobject's this pointer. Of course, if you know your compiler always
lays out objects in order of declaration in the inheritance list, you can ensure that you always
put the critical class at the beginning of the list (assuming there's only one critical class).
However, such a class may exist in the inheritance hierarchy of another class and you may
unwittingly put it in the wrong place during multiple inheritance. Fortunately, using run-time
type identification (the subject of Chapter XX) will produce the proper pointer to the actual
object, even if multiple inheritance is used.
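
As a preview of that technique, Standard C++ lets a polymorphic class ask for the address of the complete object with dynamic_cast<void*>. The sketch below is an illustration only: Tracked, Payload and Widget are hypothetical names, and the base class must have at least one virtual function for the cast to compile.

```cpp
#include <cassert>

// Hypothetical mix-in; it must be polymorphic for dynamic_cast to work.
class Tracked {
public:
  virtual ~Tracked() {}
  // Address of the complete (most-derived) object:
  const void* objectStart() const {
    return dynamic_cast<const void*>(this);
  }
  // Address of this Tracked subobject only:
  const void* subobjectStart() const { return this; }
};

class Payload { char bytes[0x10]; };

// Tracked is second in the list, like Persistent in WData2.
class Widget : public Payload, public Tracked {};

// True if the mix-in finds the real start of the object.
bool findsRealStart() {
  Widget w;
  return w.objectStart() == static_cast<const void*>(&w);
}

void demo() {
  assert(findsRealStart());
}
```

This recovers the correct starting address wherever the mix-in lands in the inheritance list. It does not make a raw byte copy safe, though, since a polymorphic object contains a VPTR that should not be written to or read from disk.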



Improved persistence 



A more practical approach to persistence is to create virtual functions in the base class for
reading and writing and then require the creator of any new class that must be streamed to
redefine these functions. The argument to the function is the stream object to write to or read
from. (Sometimes there's only a single function for streaming, and the argument contains
information about whether you're reading or writing.) Then the creator of the class, who
knows best how the new parts should be read or written, is responsible for making the correct
function calls. This doesn't have the "magical" quality of the previous example, and it requires
more coding and knowledge on the part of the user, but it works and doesn't break when
pointers are present:

//: C06:Persist2.cpp
// Improved MI persistence
#include "../require.h"
#include <iostream>
#include <fstream>
#include <cstring>
using namespace std;

class Persistent {
public:
  virtual void write(ostream& out) const = 0;
  virtual void read(istream& in) = 0;
  virtual ~Persistent() {}
};

class Data {
protected:
  float f[3];
public:
  Data(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) {
    f[0] = f0;
    f[1] = f1;
    f[2] = f2;
  }
  void print(const char* msg = "") const {
    if(*msg) cout << msg << endl;
    for(int i = 0; i < 3; i++)
      cout << "f[" << i << "] = "
           << f[i] << endl;
  }
};

class WData1 : public Persistent, public Data {
public:
  WData1(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) : Data(f0, f1, f2) {}
  void write(ostream& out) const {
    out << f[0] << " "
      << f[1] << " " << f[2] << " ";
  }
  void read(istream& in) {
    in >> f[0] >> f[1] >> f[2];
  }
};

class WData2 : public Data, public Persistent {
public:
  WData2(float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0) : Data(f0, f1, f2) {}
  void write(ostream& out) const {
    out << f[0] << " "
      << f[1] << " " << f[2] << " ";
  }
  void read(istream& in) {
    in >> f[0] >> f[1] >> f[2];
  }
};

class Conglomerate : public Data,
public Persistent {
  char* name; // Contains a pointer
  WData1 d1;
  WData2 d2;
public:
  Conglomerate(const char* nm = "",
    float f0 = 0.0, float f1 = 0.0,
    float f2 = 0.0, float f3 = 0.0,
    float f4 = 0.0, float f5 = 0.0,
    float f6 = 0.0, float f7 = 0.0,
    float f8 = 0.0) : Data(f0, f1, f2),
    d1(f3, f4, f5), d2(f6, f7, f8) {
    name = new char[strlen(nm) + 1];
    strcpy(name, nm);
  }
  // Release the string storage:
  ~Conglomerate() { delete []name; }
  void write(ostream& out) const {
    int i = strlen(name) + 1;
    out << i << " "; // Store size of string
    out << name << endl;
    d1.write(out);
    d2.write(out);
    out << f[0] << " " << f[1] << " " << f[2];
  }
  // Must read in same order as write:
  void read(istream& in) {
    delete []name; // Remove old storage
    int i;
    in >> i >> ws; // Get int, strip whitespace
    name = new char[i];
    in.getline(name, i);
    d1.read(in);
    d2.read(in);
    in >> f[0] >> f[1] >> f[2];
  }
  void print() const {
    Data::print(name);
    d1.print();
    d2.print();
  }
};

int main() {
  {
    ofstream data("data.dat");
    assure(data, "data.dat");
    Conglomerate C("This is Conglomerate C",
      1.1, 2.2, 3.3, 4.4, 5.5,
      6.6, 7.7, 8.8, 9.9);
    cout << "C before storage" << endl;
    C.print();
    C.write(data);
  } // Closes file
  ifstream data("data.dat");
  assure(data, "data.dat");
  Conglomerate C;
  C.read(data);
  cout << "after storage: " << endl;
  C.print();
} ///:~

The pure virtual functions in Persistent must be redefined in the derived classes to perform 
the proper reading and writing. If you already knew that Data would be persistent, you could 
inherit directly from Persistent and redefine the functions there, thus eliminating the need for 
multiple inheritance. This example is based on the idea that you don't own the code for Data, 
that it was created elsewhere and may be part of another class hierarchy so you don't have 
control over its inheritance. However, for this scheme to work correctly you must have access 
to the underlying implementation so it can be stored; thus the use of protected. 



The classes WData1 and WData2 use familiar iostream inserters and extractors to store and
retrieve the protected data in Data to and from the iostream object. In write( ), you can see
that spaces are added after each floating point number is written; these are necessary to allow
parsing of the data on input.

The class Conglomerate not only inherits from Data, it also has member objects of type
WData1 and WData2, as well as a pointer to a character string. In addition, all the classes
that inherit from Persistent also contain a VPTR, so this example shows the kind of problem
you'll actually encounter when using persistence.

When you create write( ) and read( ) function pairs, the read( ) must exactly mirror what
happens during the write( ), so read( ) pulls the bits off the disk the same way they were
placed there by write( ). Here, the first problem that's tackled is the char*, which points to a
string of any length. The size of the string is calculated and stored on disk as an int (followed
by a space to enable parsing) to allow the read( ) function to allocate the correct amount of
storage.

When you have subobjects that have read( ) and write( ) member functions, all you need to
do is call those functions in the new read( ) and write( ) functions. This is followed by direct
storage of the members in the base class.
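
The size-then-data idea is worth isolating. Here is a small sketch of it using std::string and an in-memory stream in place of a file; writeString( ), readString( ) and stringRoundTrip( ) are hypothetical helpers, not part of the example above.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes the length, a space, then the characters.
void writeString(std::ostream& out, const std::string& s) {
  out << s.size() << ' ' << s;
}

// Reads the length first, then exactly that many characters.
std::string readString(std::istream& in) {
  std::string::size_type n = 0;
  in >> n;
  assert(!in.fail()); // A length must be present
  in.get(); // Skip the single separating space
  std::string s(n, '\0');
  if(n > 0) in.read(&s[0], static_cast<std::streamsize>(n));
  return s;
}

// Stores two strings back to back and reads them in the same order.
bool stringRoundTrip(const std::string& text) {
  std::stringstream disk; // Stands in for a file
  writeString(disk, text);
  writeString(disk, "next");
  return readString(disk) == text && readString(disk) == "next";
}
```

Because the reader knows the length in advance, the stored text may contain spaces or be empty, and the following item still starts in the right place.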

People have gone to great lengths to automate persistence, for example, by creating modified 
preprocessors to support a "persistent" keyword to be applied when defining a class. One can 
imagine a more elegant approach than the one shown here for implementing persistence, but it 
has the advantage that it works under all implementations of C++, doesn't require special
language extensions, and is relatively bulletproof. 



Avoiding MI 



The need for multiple inheritance in Persist2.cpp is contrived, based on the concept that you
don't have control of some of the code in the project. Upon examination of the example, you
can see that MI can be easily avoided by using member objects of type Data, and putting the
virtual read( ) and write( ) members inside Data or WData1 and WData2 rather than in a
separate class. There are many situations like this one where multiple inheritance may be
avoided; the language feature is included for unusual, special-case situations that would
otherwise be difficult or impossible to handle. But when the question of whether to use
multiple inheritance comes up, you should ask two questions:

1. Do I need to show the public interfaces of both these classes, or could one
class be embedded with some of its interface produced with member
functions in the new class?

2. Do I need to upcast to both of the base classes? (This applies when you
have more than two base classes, of course.)

If you can answer "no" to either question, you can avoid using MI and should probably do
so.

One situation to watch for is when one class only needs to be upcast as a function argument.
In that case, the class can be embedded and an automatic type conversion operator provided in
your new class to produce a reference to the embedded object. Any time you use an object of
your new class as an argument to a function that expects the embedded object, the type
conversion operator is used. However, type conversion can't be used for normal member
selection; that requires inheritance.


Mixin types 
Interfaces in general



Repairing an interface 

One of the best arguments for multiple inheritance involves code that's out of your control. 
Suppose you've acquired a library that consists of a header file and compiled member 
functions, but no source code for the member functions. The library is a class hierarchy with 
virtual functions, and it contains some global functions that take references to the base class 
of the library; that is, it uses the library objects polymorphically. Now suppose you build an 
application around this library and write your own code that uses the base class 
polymorphically. 

Later in the development of the project, or sometime during its maintenance, you discover 
that the base-class interface provided by the vendor is incomplete: a function may be 
nonvirtual when you need it to be virtual, or a virtual function may be missing from the 
interface altogether even though it is essential to the solution of your problem. If you had the 
source code, you could go back and put it in. But you don't, and you have a large library of 
derived classes plus your own code that uses it. Multiple inheritance is the perfect solution. 

For example, here's the header file for a library you acquire: 



//: C06:Vendor.h 
// Vendor-supplied class header 
// You only get this & the compiled Vendor.obj 
#ifndef VENDOR_H 
#define VENDOR_H 

class Vendor { 
public: 
  virtual void v() const; 
  void f() const; 
  ~Vendor(); 
}; 

class Vendor1 : public Vendor { 
public: 
  void v() const; 
  void f() const; 
  ~Vendor1(); 
}; 

void A(const Vendor&); 
void B(const Vendor&); 
// Etc. 

#endif // VENDOR_H ///:~ 

Assume the library is much bigger, with more derived classes and a larger interface. Notice 
that it also includes the functions A( ) and B( ), which take a base-class reference and treat it 
polymorphically. Here's the implementation file for the library: 

//: C06:Vendor.cpp {O} 
// Implementation of VENDOR.H 
// This is compiled and unavailable to you 
#include "Vendor.h" 
#include <fstream> 
using namespace std; 

extern ofstream out; // Trace file, defined by the program that uses the library 

void Vendor::v() const { 
  out << "Vendor::v()\n"; 
} 

void Vendor::f() const { 
  out << "Vendor::f()\n"; 
} 

Vendor::~Vendor() { 
  out << "~Vendor()\n"; 
} 

void Vendor1::v() const { 
  out << "Vendor1::v()\n"; 
} 

void Vendor1::f() const { 
  out << "Vendor1::f()\n"; 
} 

Vendor1::~Vendor1() { 
  out << "~Vendor1()\n"; 
} 

void A(const Vendor& V) { 
  // ... 
  V.v(); // Virtual: dispatched on the dynamic type 
  V.f(); // Nonvirtual: always Vendor::f() 
} 

void B(const Vendor& V) { 
  // ... 
  V.v(); 
  V.f(); 
} ///:~ 

In your project, this source code is unavailable to you. Instead, you get a compiled file as 
Vendor.obj or Vendor.lib (or the equivalent for your system). 

The problem occurs in the use of this library. First, the destructor isn't virtual. This is actually 
a design error on the part of the library creator. In addition, f( ) was not made virtual; assume 
the library creator decided it wouldn't need to be. And you discover that the interface to the 
base class is missing a function essential to the solution of your problem. Also suppose 
you've already written a fair amount of code using the existing interface (not to mention the 
functions A( ) and B( ), which are out of your control), and you don't want to change it. 

To repair the problem, create your own class interface and multiply inherit a new set of 
derived classes from your interface and from the existing classes: 



//: C06:Paste.cpp 
//{L} Vendor 
// Fixing a mess with MI 
#include "Vendor.h" 
#include <fstream> 
using namespace std; 

ofstream out("paste.out"); // Trace file shared with Vendor.cpp 

class MyBase { // Repair Vendor interface 
public: 
  virtual void v() const = 0; 
  virtual void f() const = 0; 
  // New interface function: 
  virtual void g() const = 0; 
  virtual ~MyBase() { out << "~MyBase()\n"; } 
}; 

class Paste1 : public MyBase, public Vendor1 { 
public: 
  void v() const { 
    out << "Paste1::v()\n"; 
    Vendor1::v(); 
  } 
  void f() const { 
    out << "Paste1::f()\n"; 
    Vendor1::f(); 
  } 
  void g() const { 
    out << "Paste1::g()\n"; 
  } 
  ~Paste1() { out << "~Paste1()\n"; } 
}; 

int main() { 
  Paste1& p1p = *new Paste1; 
  MyBase& mp = p1p; // Upcast 
  out << "calling f()\n"; 
  mp.f(); // Right behavior 
  out << "calling g()\n"; 
  mp.g(); // New behavior 
  out << "calling A(p1p)\n"; 
  A(p1p); // Same old behavior 
  out << "calling B(p1p)\n"; 
  B(p1p); // Same old behavior 
  out << "delete mp\n"; 
  // Deleting a reference to a heap object: 
  delete &mp; // Right behavior 
} ///:~ 

In MyBase (which does not use MI), both f( ) and the destructor are now virtual, and a new 
virtual function g( ) has been added to the interface. Now each of the derived classes in the 
original library must be recreated, mixing in the new interface with MI. The functions 
Paste1::v( ) and Paste1::f( ) need to call only the original base-class versions of their 
functions. But now, if you upcast to MyBase as in main( ) 

  MyBase& mp = p1p; // Upcast 

any function calls made through mp will be polymorphic, including delete. Also, the new 
interface function g( ) can be called through mp. Here's the output of the program: 

calling f() 
Paste1::f() 
Vendor1::f() 
calling g() 
Paste1::g() 
calling A(p1p) 
Paste1::v() 
Vendor1::v() 
Vendor::f() 
calling B(p1p) 
Paste1::v() 
Vendor1::v() 
Vendor::f() 
delete mp 
~Paste1() 
~Vendor1() 
~Vendor() 
~MyBase() 

The original library functions A( ) and B( ) still work the same (assuming the new v( ) calls its 
base-class version). The destructor is now virtual and exhibits the correct behavior. 

Although this is a messy example, it does occur in practice and it's a good demonstration of 
where multiple inheritance is clearly necessary: You must be able to upcast to both base 
classes. 



Summary 



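The reason MI exists in C++ and not in other OOP languages is that C++ is a hybrid language 
and couldn't enforce a single monolithic class hierarchy the way Smalltalk does. Instead, C++ 
allows many inheritance trees to be formed, so sometimes you may need to combine the 
interfaces from two or more trees into a new class. 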



If no "diamonds" appear in your class hierarchy, MI is fairly simple (although identical 
function signatures in base classes must be resolved). If a diamond appears, you may want to 
eliminate duplicate subobjects by introducing virtual base classes. This not only adds 
confusion, but the underlying representation becomes more complex and less efficient. 

Multiple inheritance has been called the "goto of the 90s." This seems appropriate because 
like a goto, MI is best avoided in normal programming, but can occasionally be very useful. 
It's a "minor" but more advanced feature of C++, designed to solve problems that arise in 
special situations. If you find yourself using it often, you may want to take a look at your 
reasoning. A good Occam's Razor is to ask, "Must I upcast to all of the base classes?" If not, 
your life will be easier if you embed instances of all the classes you don't need to upcast to. 



A phrase coined by Zack Urlocker. 

Exercises 



1. These exercises will take you step-by-step through the traps of MI. Create a 
base class X with a single constructor that takes an int argument and a 
member function f( ) that takes no arguments and returns void. Now inherit 
X into Y and Z, creating constructors for each of them that take a single 
int argument. Now multiply inherit Y and Z into A. Create an object of 
class A, and call f( ) for that object. Fix the problem with explicit 
disambiguation. 

2. Starting with the results of exercise 1, create a pointer to an X called px, 
and assign to it the address of the object of type A you created before. Fix 
the problem using a virtual base class. Now fix X so you no longer have to 
call the constructor for X inside A. 

3. Starting with the results of exercise 2, remove the explicit disambiguation 
for f( ), and see if you can call f( ) through px. Trace it to see which 
function gets called. Fix the problem so the correct function will be called 
in a class hierarchy. 
7: Exception 
handling 



Improved error recovery is one of the most powerful ways 
you can increase the robustness of your code. 

Unfortunately, it's almost accepted practice to ignore error conditions, as if we're in a state of 
denial about errors. Some of the reason is no doubt the tediousness and code bloat of checking 
for many errors. For example, printf( ) returns the number of characters that were 
successfully printed, but virtually no one checks this value. The proliferation of code alone 
would be disgusting, not to mention the difficulty it would add in reading the code. 

The problem with C's approach to error handling could be thought of as one of coupling - the 
user of a function must tie the error-handling code so closely to that function that it becomes 
too ungainly and awkward to use. 

One of the major features in C++ is exception handling, which is a better way of thinking 
about and handling errors. With exception handling, 

1. Error-handling code is not nearly so tedious to write, and it doesn't become 
mixed up with your "normal" code. You write the code you want to happen; 
later in a separate section you write the code to cope with the problems. If you 
make multiple calls to a function, you handle the errors from that function once, 
in one place. 

2. Errors cannot be ignored. If a function needs to send an error message to the 
caller of that function, it "throws" an object representing that error out of the 
function. If the caller doesn't "catch" the error and handle it, it goes to the next 
enclosing scope, and so on until someone catches the error. 

This chapter examines C's approach to error handling (such as it is), why it did not work very 
well for C, and why it won't work at all for C++. Then you'll learn about try, throw, and 
catch, the C++ keywords that support exception handling. 



Error handling in C 



Up to this point in the book, assert( ) was used as it was intended: for debugging 
during development with code that could be disabled with #define NDEBUG for the shipping 
product. Runtime error checking uses the require.h functions developed in Chapter XX. 
These were a convenient way to say, "There's a problem here you'll probably want to handle 
with some more sophisticated code, but you don't need to be distracted by it in this example." 
The require.h functions may be enough for small programs, but for complicated products you 
may need to write more sophisticated error-handling code. 

Error handling is quite straightforward in situations where you check some condition and you 
know exactly what to do because you have all the necessary information in that context. Of 
course, you just handle the error at that point. These are ordinary errors and not the subject of 
this chapter. 

The problem occurs when you don't have enough information in that context, and you need to 
pass the error information into a larger context where that information does exist. There are 
three typical approaches in C to handle this situation. 

1. Return error information from the function or, if the return value cannot be 
used this way, set a global error condition flag. (Standard C provides errno 
and perror( ) to support this.) As mentioned before, the programmer may 
simply ignore the error information because tedious and obfuscating error 
checking must occur with each function call. In addition, returning from a 
function that hits an exceptional condition may not make sense. 

2. Use the little-known Standard C library signal-handling system, 
implemented with the signal( ) function (to determine what happens when 
the event occurs) and raise( ) (to generate an event). Again, this has high 
coupling because it requires the user of any library that generates signals to 
understand and install the appropriate signal-handling mechanism; also in 
large projects the signal numbers from different libraries may clash with 
each other. 

3. Use the nonlocal goto functions in the Standard C library: setjmp( ) and 
longjmp( ). With setjmp( ) you save a known good state in the program, 
and if you get into trouble, longjmp( ) will restore that state. Again, there is 
high coupling between the place where the state is stored and the place 
where the error occurs. 
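
To make the first approach concrete, here is a minimal sketch of return-code and errno checking. The function names parsePositive( ) and sumOfTwo( ) are hypothetical; the point is only that every call site has to repeat the test.

```cpp
#include <cerrno>
#include <cstdlib>

// Returns true on success and stores the parsed value in 'result'.
// Failure is reported through the return value; strtol() itself
// reports overflow through the global errno flag.
bool parsePositive(const char* text, long& result) {
  errno = 0;
  char* end = nullptr;
  long value = std::strtol(text, &end, 10);
  if (errno == ERANGE) return false;              // Out of range
  if (end == text || *end != '\0') return false;  // Not a number
  if (value <= 0) return false;                   // Not positive
  result = value;
  return true;
}

// Every caller must test every call; nothing forces it to.
int sumOfTwo(const char* a, const char* b) {
  long x = 0, y = 0;
  if (!parsePositive(a, x)) return -1;
  if (!parsePositive(b, y)) return -1;
  return static_cast<int>(x + y);
}
```

Notice that sumOfTwo( ) has to invent its own error value (-1 here) to pass the failure along, and that its callers are free to ignore it.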

When considering error-handling schemes with C++, there's an additional very critical 
problem: The C techniques of signals and setjmp/longjmp do not call destructors, so objects 
aren't properly cleaned up. This makes it virtually impossible to effectively recover from an 
exceptional condition because you'll always leave objects behind that haven't been cleaned 
up and that can no longer be accessed. The following example demonstrates this with 
setjmp/longjmp: 

//: C07:Nonlocal.cpp 
// setjmp() & longjmp() 
#include <iostream> 
#include <csetjmp> 
using namespace std; 

class Rainbow { 
public: 
  Rainbow() { cout << "Rainbow()" << endl; } 
  ~Rainbow() { cout << "~Rainbow()" << endl; } 
}; 

jmp_buf kansas; // Saved processor state 

void oz() { 
  Rainbow rb; 
  for(int i = 0; i < 3; i++) 
    cout << "there's no place like home\n"; 
  longjmp(kansas, 47); // Jumps out without destroying rb 
} 

int main() { 
  if(setjmp(kansas) == 0) { 
    cout << "tornado, witch, munchkins...\n"; 
    oz(); 
  } else { 
    cout << "Auntie Em! " 
         << "I had the strangest dream..." 
         << endl; 
  } 
} ///:~ 



setjmp( ) is an odd function because if you call it directly, it stores all the relevant 
information about the current processor state in the jmp_buf and returns zero. In that case it 
has the behavior of an ordinary function. However, if you call longjmp( ) using the same 
jmp_buf, it's as if you're returning from setjmp( ) again - you pop right out the back end of 
the setjmp( ). This time, the value returned is the second argument to longjmp( ), so you can 
detect that you're actually coming back from a longjmp( ). You can imagine that with many 
different jmp_bufs, you could pop around to many different places in the program. The 
difference between a local goto (with a label) and this nonlocal goto is that you can go 
anywhere with setjmp/longjmp (with some restrictions not discussed here). 
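
The "returns twice" behavior can be seen in isolation with a small sketch. The names checkpoint, fail( ) and runOnce( ) are hypothetical, and the example deliberately uses no objects with destructors, so the jump itself is well defined.

```cpp
#include <csetjmp>

static std::jmp_buf checkpoint;  // Saved state (hypothetical name)

// Simulates a deeply nested failure: control never returns from here.
static void fail(int code) {
  std::longjmp(checkpoint, code);
}

// Returns the value that setjmp() produced on its second "return".
int runOnce() {
  switch (setjmp(checkpoint)) {
    case 0:          // Direct call: state saved, setjmp() returned zero
      fail(47);
      return -1;     // Never reached
    case 47:         // Arrived here via longjmp(checkpoint, 47)
      return 47;
    default:
      return -2;     // Some other longjmp() value
  }
}
```

The call to setjmp( ) is used as the controlling expression of the switch because the C standard only guarantees its behavior in a few such contexts.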
The problem with C++ is that longjmp( ) doesn't respect objects; in particular it doesn't call 
destructors when it jumps out of a scope. Destructor calls are essential, so this approach 
won't work with C++. 

Throwing an exception 



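If you encounter an exceptional situation in your code (that is, one where you don't have 
enough information in the current context to decide what to do), you can send information 
about the error into a larger context by creating an object containing that information and 
"throwing" it out of your current context. This is called throwing an exception. Here's what it 
looks like: 

  throw myerror("something bad happened"); 

myerror is an ordinary class, which takes a char* as its argument. You can use any type 
when you throw (including built-in types), but often you'll use special types created just for 
throwing exceptions. 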

The keyword throw causes a number of relatively magical things to happen. First it creates an 
object that isn't there under normal program execution, and of course the constructor is called 
for that object. Then the object is, in effect, "returned" from the function, even though that 
object type isn't normally what the function is designed to return. A simplistic way to think 
about exception handling is as an alternate return mechanism, although you get into trouble if 
you take the analogy too far - you can also exit from ordinary scopes by throwing an 
exception. But a value is returned, and the function or scope exits. 

Any similarity to function returns ends there because where you return to is someplace 
completely different than for a normal function call. (You end up in an appropriate exception 
handler that may be miles away from where the exception was thrown.) In addition, only 
objects that were successfully created at the time of the exception are destroyed (unlike a 
normal function return that assumes all the objects in the scope must be destroyed). Of course, 
the exception object itself is also properly cleaned up at the appropriate point. 

In addition, you can throw as many different types of objects as you want. Typically, you'll 
throw a different type for each different type of error. The idea is to store the information in 
the object and the type of object, so someone in the bigger context can figure out what to do 
with your exception. 
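
As a rough illustration of throw acting as an alternate return path, consider the sketch below. MyError is a hypothetical class modeled on the myerror mentioned in the text, and checkedDivide( ) and tryDivide( ) are invented for the example.

```cpp
#include <string>

// A small class created just for throwing; it stores the error
// information inside the object.
class MyError {
  std::string message_;
public:
  explicit MyError(const char* msg) : message_(msg) {}
  const std::string& what() const { return message_; }
};

// Declared to return int, yet it can also "return" a MyError.
int checkedDivide(int a, int b) {
  if (b == 0) throw MyError("division by zero");
  return a / b;
}

// The larger context that knows what to do with the error.
std::string tryDivide(int a, int b) {
  try {
    return std::to_string(checkedDivide(a, b));
  } catch (const MyError& e) {
    return "error: " + e.what();
  }
}
```

Here tryDivide(6, 3) takes the normal return path, while tryDivide(1, 0) ends up in the handler with the information carried by the thrown object.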



You may be surprised when you run the example - some C++ compilers have extended 
longjmp( ) to clean up objects on the stack. This is nonportable behavior. 

Catching an exception 



If a function throws an exception, it must assume that exception is caught and dealt with. As 
mentioned before, one of the advantages of C++ exception handling is that it allows you to 
concentrate on the problem you're actually trying to solve in one place, and then deal with the 
errors from that code in another place. 



The try block 



If you're inside a function and you throw an exception (or a called function throws an 
exception), the function will exit in the process of throwing. If you don't want a throw to 
leave a function, you can set up a special block within the function where you try to solve 
your actual programming problem (and potentially generate exceptions). This is called the try 
block because you try your various function calls there. The try block is an ordinary scope, 
preceded by the keyword try: 

  try { 
    // Code that may generate exceptions 
  } 

If you were carefully checking for errors without using exception handling, you'd have to 
surround every function call with setup and test code, even if you call the same function 
several times. With exception handling, you put everything in a try block without error 
checking. This means your code is a lot easier to write and easier to read because the goal of 
the code is not confused with the error checking. 



Exception handlers 



Of course, the thrown exception must end up someplace. This is the exception handler, and 
there's one for every exception type you want to catch. Exception handlers immediately 
follow the try block and are denoted by the keyword catch: 



  try { 
    // code that may generate exceptions 
  } catch(type1 id1) { 
    // handle exceptions of type1 
  } catch(type2 id2) { 
    // handle exceptions of type2 
  } 
  // etc... 

Each catch clause (exception handler) is like a little function that takes a single argument of 
one particular type. The identifier (id1, id2, and so on) may be used inside the handler, just 
like a function argument, although sometimes there is no identifier because it's not needed in 
the handler - the exception type gives you enough information to deal with it. 

The handlers must appear directly after the try block. If an exception is thrown, the exception- 
handling mechanism goes hunting for the first handler with an argument that matches the type 
of the exception. Then it enters that catch clause, and the exception is considered handled. 
(The search for handlers stops once the catch clause is finished.) Only the matching catch 
clause executes; it's not like a switch statement where you need a break after each case to 
prevent the remaining ones from executing. 

Notice that, within the try block, a number of different function calls might generate the same 
exception, but you only need one handler. 
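
The matching rules above can be sketched with two empty exception classes. TooBig, TooSmall, check( ) and classify( ) are hypothetical names chosen for this illustration.

```cpp
#include <string>

struct TooBig {};    // Exception types can be this small
struct TooSmall {};

void check(int n) {
  if (n > 100) throw TooBig();
  if (n < 0) throw TooSmall();
}

std::string classify(int a, int b) {
  try {
    check(a);  // Either call may throw the same exception types,
    check(b);  // yet one handler per type is enough
    return "ok";
  } catch (const TooBig&) {    // No identifier: the type says it all
    return "too big";
  } catch (const TooSmall&) {  // Only the first matching handler runs
    return "too small";
  }
}
```

With classify(200, -1) the first call throws TooBig, so the second call never executes and only the TooBig handler runs.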



Termination vs. resumption 



There are two basic models in exception-handling theory. In termination (which is what C++ 
supports) you assume the error is so critical there's no way to get back to where the exception 
occurred. Whoever threw the exception decided there was no way to salvage the situation, and 
they don't want to come back. 

The alternative is called resumption. It means the exception handler is expected to do 
something to rectify the situation, and then the faulting function is retried, presuming success 
the second time. If you want resumption, you still hope to continue execution after the 
exception is handled, so your exception is more like a function call - which is how you should 
set up situations in C++ where you want resumption-like behavior (that is, don't throw an 
exception; call a function that fixes the problem). Alternatively, place your try block inside a 
while loop that keeps reentering the try block until the result is satisfactory. 
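
The loop-around-a-try-block idea might look like the following sketch. BufferTooSmall, fill( ) and fillWithRetry( ) are hypothetical; the handler repairs the problem and the loop reenters the try block.

```cpp
#include <cstddef>
#include <vector>

// Carries enough information for the handler to fix the problem.
struct BufferTooSmall { std::size_t needed; };

void fill(std::vector<int>& buf, std::size_t count) {
  if (buf.size() < count) throw BufferTooSmall{count};
  for (std::size_t i = 0; i < count; ++i)
    buf[i] = static_cast<int>(i);
}

// Returns the number of attempts it took to succeed.
int fillWithRetry(std::vector<int>& buf, std::size_t count) {
  int attempts = 0;
  while (true) {
    ++attempts;
    try {
      fill(buf, count);
      return attempts;        // Satisfactory result: leave the loop
    } catch (const BufferTooSmall& e) {
      buf.resize(e.needed);   // Repair, then go around again
    }
  }
}
```

This is still termination: fill( ) is abandoned when it throws, and the retry is a completely new call rather than a continuation of the old one.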

Historically, programmers using operating systems that supported resumptive exception 
handling eventually ended up using termination-like code and skipping resumption. So 
although resumption sounds attractive at first, it seems it isn't quite so useful in practice. One 
reason may be the distance that can occur between the exception and its handler; it's one thing 
to terminate to a handler that's far away, but to jump to that handler and then back again may 
be too conceptually difficult for large systems where the exception can be generated from 
many points. 



The exception specification 



You're not required to inform the person using your function what exceptions you might 
throw. However, this is considered very uncivilized because it means he cannot be sure what 
code to write to catch all potential exceptions. Of course, if he has your source code, he can 
hunt through and look for throw statements, but very often a library doesn't come with 
sources. C++ provides a syntax to allow you to politely tell the user what exceptions this 
function throws, so the user may handle them. This is the exception specification and it's part 
of the function declaration, appearing after the argument list. 
The exception specification reuses the keyword throw, followed by a parenthesized list of all 
the potential exception types. So your function declaration may look like 

  void f() throw(toobig, toosmall, divzero); 

With exceptions, the traditional function declaration 

  void f(); 

means that any type of exception may be thrown from the function. If you say 

  void f() throw(); 

it means that no exceptions are thrown from a function. 

For good coding policy, good documentation, and ease-of-use for the function caller, you 
should always use an exception specification when you write a function that throws 
exceptions. 

unexpected( ) 

If your exception specification claims you're going to throw a certain set of exceptions and 
then you throw something that isn't in that set, what's the penalty? The special function 
unexpected( ) is called when you throw something other than what appears in the exception 
specification. 



set_unexpected( ) 



unexpected( ) is implemented with a pointer to a function, so you can change its behavior. 
You do so with a function called set_unexpected( ) which, like set_new_handler( ), takes 
the address of a function with no arguments and void return value. Also, it returns the 
previous value of the unexpected( ) pointer so you can save it and restore it later. To use 
set_unexpected( ), you must include the header file <exception>. Here's an example that 
shows a simple use of all the features discussed so far in the chapter: 



//: C07:Except.cpp 
// Basic exceptions 
// Exception specifications & unexpected() 
#include <exception> 
#include <iostream> 
#include <cstdlib> 
#include <cstring> 
using namespace std; 

class Up {}; 
class Fit {}; 
void g(); 

void f(int i) throw (Up, Fit) { 
  switch(i) { 
    case 1: throw Up(); 
    case 2: throw Fit(); 
  } 
  g(); 
} 

// void g() {} // Version 1 
void g() { throw 47; } // Version 2 
// (Can throw built-in types) 

void my_unexpected() { 
  cout << "unexpected exception thrown"; 
  exit(1); 
} 

int main() { 
  set_unexpected(my_unexpected); 
  // (ignores return value) 
  for(int i = 1; i <= 3; i++) 
    try { 
      f(i); 
    } catch(Up) { 
      cout << "Up caught" << endl; 
    } catch(Fit) { 
      cout << "Fit caught" << endl; 
    } 
} ///:~ 

The classes Up and Fit are created solely to throw as exceptions. Often exception classes will 
be this small, but sometimes they contain additional information so that the handlers can 
query them. 

f() is a function that promises in its exception specification to throw only exceptions of type 
Up and Fit, and from looking at the function definition this seems plausible. Version one of 
g( ), called by f( ), doesn't throw any exceptions so this is true. But then someone changes g( ) 
so it throws exceptions and the new g( ) is linked in with f( ). Now f( ) begins to throw a new 
exception, unbeknown to the creator of f( ). Thus the exception specification is violated. 

The my_unexpected( ) function has no arguments or return value, following the proper form 
for a custom unexpected( ) function. It simply prints a message so you can see it has been 
called, then exits the program. Your new unexpected( ) function must not return (that is, you 
can write the code that way but it's an error). However, it can throw another exception (you 
can even rethrow the same exception), or call exit( ) or abort( ). If unexpected( ) throws an 
exception, the search for the handler starts at the function call that threw the unexpected 
exception. (This behavior is unique to unexpected( ).) 

Although the new_handler( ) function pointer can be null and the system will do something 
sensible, the unexpected( ) function pointer should never be null. The default value is 
terminate( ) (mentioned later), but whenever you use exceptions and specifications you 
should write your own unexpected( ) to log the error and either rethrow it, throw something 
new, or terminate the program. 

In main( ), the try block is within a for loop so all the possibilities are exercised. Note that 
this is a way to achieve something like resumption - nest the try block inside a for, while, do, 
or if and cause any exceptions to attempt to repair the problem; then attempt the try block 
again. 

Only the Up and Fit exceptions are caught because those are the only ones the programmer of 
f( ) said would be thrown. Version two of g( ) causes my_unexpected( ) to be called because 
f( ) then throws an int. (You can throw any type, including a built-in type.) 

In the call to set_unexpected( ), the return value is ignored, but it can also be saved in a 
pointer to function and restored later. 

Better exception specifications? 

You may feel the existing exception specification rules aren't very safe, and that 

  void f(); 

should mean that no exceptions are thrown from this function. If the programmer wants to 
throw any type of exception, you may think he or she should have to say 

  void f() throw(...); // Not in C++ 

This would surely be an improvement because function declarations would be more explicit. 
Unfortunately you can't always know by looking at the code in a function whether an 
exception will be thrown - it could happen because of a memory allocation, for example. 
Worse, existing functions written before exception handling was introduced may find 
themselves inadvertently throwing exceptions because of the functions they call (which may 
be linked into new, exception-throwing versions). Thus, the ambiguity, so 

  void f(); 

means "Maybe I'll throw an exception, maybe I won't." This ambiguity is necessary to avoid 
hindering code evolution. 



Catching any exception 



As mentioned, if your function has no exception specification, any type of exception can be 
thrown. One solution to this problem is to create a handler that catches any type of exception. 
You do this using the ellipses in the argument list (a la C): 

  catch(...) { 
    cout << "an exception was thrown" << endl; 
  } 

This will catch any exception, so you'll want to put it at the end of your list of handlers to 
avoid pre-empting any that follow it. 

The ellipses give you no possibility to have an argument or to know anything about the type 
of the exception. It's a catch-all. 

Rethrowing an exception 

Sometimes you'll want to rethrow the exception that you just caught, particularly when you 
use the ellipses to catch any exception because there's no information available about the 
exception. This is accomplished by saying throw with no argument: 

  catch(...) { 
    cout << "an exception was thrown" << endl; 
    throw; 
  } 

Any further catch clauses for the same try block are still ignored - the throw causes the 
exception to go to the exception handlers in the next-higher context. In addition, everything 
about the exception object is preserved, so the handler at the higher context that catches the 
specific exception type is able to extract all the information from that object. 
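
A short sketch may help show that the rethrown object keeps its type and contents. The function names and the global trace log are hypothetical; std::runtime_error is the Standard Library class from <stdexcept>.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> trace;  // Hypothetical log of what happened

void risky() { throw std::runtime_error("disk full"); }

void middle() {
  try {
    risky();
  } catch (...) {
    // Nothing is known about the exception here, so only do
    // local cleanup and pass the same object upward.
    trace.push_back("middle: cleanup");
    throw;
  }
}

std::string outer() {
  try {
    middle();
  } catch (const std::runtime_error& e) {
    // The higher context catches the specific type and can
    // extract all the information from the original object.
    trace.push_back("outer: caught");
    return e.what();
  }
  return "";
}
```

After a call to outer( ), the log shows that the catch-all handler ran first and the type-specific handler in the next-higher context ran second.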



Uncaught exceptions 



If none of the exception handlers following a particular try block matches an exception, that 
exception moves to the next-higher context, that is, the function or try block surrounding the 
try block that failed to catch the exception. (The location of this higher-context try block is 
not always obvious at first glance.) This process continues until, at some level, a handler 
matches the exception. At that point, the exception is considered "caught," and no further 
searching occurs. 

If no handler at any level catches the exception, it is "uncaught" or "unhandled." An uncaught 
exception also occurs if a new exception is thrown before an existing exception reaches its 
handler - the most common reason for this is that the constructor for the exception object 
itself causes a new exception. 

terminate( ) 

If an exception is uncaught, the special function terminate( ) is automatically called. Like 
unexpected( ), terminate is actually a pointer to a function. Its default value is the Standard C 
library function abort( ), which immediately exits the program with no calls to the normal 
termination functions (which means that destructors for global and static objects might not be 
called). 

No cleanups occur for an uncaught exception; that is, no destructors are called. If you don't 
wrap your code (including, if necessary, all the code in main( )) in a try block followed by 
handlers and ending with a default handler (catch(...)) to catch all exceptions, then you will 
take your lumps. An uncaught exception should be thought of as a programming error. 

set_terminate( ) 

You can install your own terminate( ) function using the standard set_terminate( ) function, 
which returns a pointer to the terminate( ) function you are replacing, so you can restore it 
later if you want. Your custom terminate( ) must take no arguments and have a void return 
value. In addition, any terminate( ) handler you install must not return or throw an exception, 
but instead must call some sort of program-termination function. If terminate( ) is called, it 
means the problem is unrecoverable. 

Like unexpected( ), the terminate( ) function pointer should never be null. 
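
The save-and-restore pattern can be sketched without actually terminating the program. The names loggingTerminate( ) and installAndRestore( ) are hypothetical; std::set_terminate and std::terminate_handler are declared in <exception>.

```cpp
#include <cstdlib>
#include <exception>
#include <iostream>

// A custom handler: takes no arguments, returns void, and ends
// the program instead of returning or throwing.
void loggingTerminate() {
  std::cerr << "uncaught exception; aborting\n";
  std::abort();
}

// Installs the handler around a region of code and then puts the
// previous handler back. Returns true if the handler handed back
// by the second call is the one installed by the first.
bool installAndRestore() {
  std::terminate_handler old = std::set_terminate(loggingTerminate);
  // ... the code under investigation would run here ...
  std::terminate_handler mine = std::set_terminate(old);
  return mine == loggingTerminate;
}
```

Bracketing a suspect region this way is one possible means of narrowing down where an uncaught exception originates, since the custom handler is only active inside the bracket.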

Here's an example showing the use of set_terminate( ). Here, the return value is saved and 
restored so the terminate( ) function can be used to help isolate the section of code where the 
uncaught exception is occurring: 



//: C07:Terminator.cpp 
// Use of set_terminate() 
// Also shows uncaught exceptions 
#include <exception> 
#include <iostream> 
#include <cstdlib> 
using namespace std; 

void terminator() { 
  cout << "I'll be back!" << endl; 
  abort(); 
} 

void (*old_terminate)() 
  = set_terminate(terminator); 

class Botch { 
public: 
  class Fruit {}; 
  void f() { 
    cout << "Botch::f()" << endl; 
    throw Fruit(); 
  } 
  ~Botch() { throw 'c'; } // Second exception during cleanup 
}; 

int main() { 
  try { 
    Botch b; 
    b.f(); 
  } catch(...) { 
    cout << "inside catch(...)" << endl; 
  } 
} ///:~ 

The definition of old_terminate looks a bit confusing at first: It not only creates a pointer to a 
function, but it initializes that pointer to the return value of set_terminate( ). Even though 
you may be familiar with seeing a semicolon right after a pointer-to-function definition, it's 
just another kind of variable and may be initialized when it is defined. 

The class Botch not only throws an exception inside f( ), but also in its destructor. This is one 
of the situations that causes a call to terminate( ), as you can see in main( ). Even though the 
exception handler says catch(...), which would seem to catch everything and leave no cause 
for terminate( ) to be called, terminate( ) is called anyway, because in the process of 
cleaning up the objects on the stack to handle one exception, the Botch destructor is called, 
and that generates a second exception, forcing a call to terminate( ). Thus, a destructor that 
throws an exception or causes one to be thrown is a design error. 



Function-level try blocks

//: C07:FunctionTryBlock.cpp
// Function-level try blocks
#include <iostream>

int main() try {

} catch(const char* msg) {

} ///:~
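
As a minimal sketch of how a function-level try block behaves (the names Resource, Holder, run and lastLog are hypothetical and are not part of the listing above): the handlers attach to the entire function body, and on a constructor they also cover the member initializer list. A handler on a constructor's function-level try block rethrows automatically when control reaches its end.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical member whose constructor may fail.
struct Resource {
  explicit Resource(bool fail) {
    if (fail) throw std::runtime_error("Resource failed");
  }
};

std::string lastLog; // Records what the handler saw (illustration only)

class Holder {
  Resource r;
public:
  // The try block covers the initializer list as well as the body.
  explicit Holder(bool fail) try : r(fail) {
  } catch (const std::runtime_error& e) {
    lastLog = e.what();
    // Reaching the end of this handler rethrows the exception.
  }
};

// An ordinary function can use the same form.
int run(bool fail) try {
  Holder h(fail);
  return 0;
} catch (const std::runtime_error&) {
  return 1;
}
```

Calling run(true) returns 1: the Holder handler logs the failure, the exception is rethrown, and the handler attached to run( ) catches it.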



Cleaning up 
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cleaned up properly as the exception was thrown. C++ exception handling guar „., 

you leave a scope, all objects in that scope whose constructors have been completed will have 
destructors called. 

Here's an example that demonstrates that constructors that aren't completed don't have the
associated destructors called. It also shows what happens when an exception is thrown in the
middle of the creation of an array of objects, and an unexpected( ) function that rethrows the
unexpected exception:

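//: C07:Cleanup.cpp
// Exceptions clean up objects
#include <fstream>
#include <exception>
#include <cstring>
using namespace std;
ofstream out("cleanup.out"); // Output file (name assumed)

class Noisy {
  static int i;
  int objnum;
  enum { sz = 40 }; // Buffer size (value assumed)
  char name[sz];
public:
  Noisy(const char* nm="array elem") throw(int) {
    objnum = i++;
    memset(name, 0, sz);
    strncpy(name, nm, sz - 1);
    out << "constructing Noisy " << objnum
        << " name [" << name << "]" << endl;
    if(objnum == 5) throw int(5);
    // Not in exception specification:
    if(*nm == 'z') throw char('z');
  }
  ~Noisy() {
    out << "destructing Noisy " << objnum
        << " name [" << name << "]" << endl;
  }
  void* operator new[](size_t sz) {
    out << "Noisy::new[]" << endl;
    return ::new char[sz];
  }
  void operator delete[](void* p) {
    out << "Noisy::delete[]" << endl;
    ::delete [] static_cast<char*>(p);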
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pected_rethrowl) { 
  out << "inside unexpected_rethrow()" << endl;
  throw; // Rethrow same exception
}


int main() {
  set_unexpected(unexpected_rethrow);
  try {
    Noisy n1("before array");
    // Throws exception:
    Noisy* array = new Noisy[7];
    Noisy n2("after array");
  } catch(int i) {
    out << "caught " << i << endl;
  }
  out << "testing unexpected:" << endl;
  try {
    Noisy n3("before unexpected");
    Noisy n4("z");
    Noisy n5("after unexpected");
  } catch(char c) {
    out << "caught " << c << endl;
  }
} ///:~

The class Noisy keeps track of objects so you can trace program progress. It keeps a count of
the number of objects created with a static data member i, and the number of the particular
object with objnum, and a character buffer called name to hold an identifier. This buffer is
first set to zeroes. Then the constructor argument is copied in. (Note that a default argument
string is used to indicate array elements, so this constructor also acts as a default constructor.)
Because the Standard C library function strncpy( ) stops copying after a null terminator or the
number of characters specified by its third argument, the number of characters copied in is
at most one less than the size of the buffer, so the last character is always zero, and a print
statement will never run off the end of the buffer.

There are two cases where a throw can occur in the constructor. The first case happens if the
object's number is 5 (not a real exception condition, but demonstrates an exception
thrown during array construction). The type thrown is int, which is the type promised in the
exception specification. The second case, also contrived, happens if the first character of the
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argument string is 'z', in which case a char is thrown. Because char is not listed in the 
exception specification, this will cause a call to unexpected( ). 

The array versions of new and delete are overloaded for the class, so you can see when 
they're called. 

The function unexpected_rethrow( ) prints a message and rethrows the same exception. It is
installed as the unexpected( ) function in the first line of main( ). Then some objects of type
Noisy are created in a try block, but the array causes an exception to be thrown, so the object
n2 is never created. You can see the results in the output of the program:

constructing Noisy 0 name [before array]
Noisy::new[]
constructing Noisy 1 name [array elem]
constructing Noisy 2 name [array elem]
constructing Noisy 3 name [array elem]
constructing Noisy 4 name [array elem]
constructing Noisy 5 name [array elem]
destructing Noisy 4 name [array elem]
destructing Noisy 3 name [array elem]
destructing Noisy 2 name [array elem]
destructing Noisy 1 name [array elem]
Noisy::delete[]
destructing Noisy 0 name [before array]
caught 5
testing unexpected:
constructing Noisy 6 name [before unexpected]
constructing Noisy 7 name [z]
inside unexpected_rethrow()
destructing Noisy 6 name [before unexpected]
caught z

Four array elements are successfully created, but in the middle of the constructor for the fifth
one, an exception is thrown. Because the fifth constructor never completes, only the
destructors for objects 1-4 are called.

The storage for the array is allocated separately with a single call to the global new. Notice 
that even though delete is never explicitly called anywhere in the program, the exception- 
handling system knows it must call delete to properly release the storage. This behavior 
happens only with "normal" versions of operator new. If you use the placement syntax
described in Chapter XX, the exception-handling mechanism will not call delete for that
object because then it might release memory that was not allocated on the heap.

Finally, object n1 is destroyed, but not object n2 because it was never created.

In the section testing unexpected_rethrow( ), the n3 object is created, and the constructor of 
n4 is begun. But before it can complete, an exception is thrown. This exception is of type 
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char, which violates the exception specification, so the une\pected( ) function is called 
(which is unexpected_rethrow( ), in this case). This rethrows the same exception, which is 
expected this time, because unexpected_rethrow( ) can throw any type of exception. The 
search begins right after the constructor for n4, and the char exception handler catches it 
(after destroying n3, the only successfully created object). Thus, the effect of
unexpected_rethrow( ) is to take any unexpected exception and make it expected; used this
way it provides a filter to allow you to track the appearance of unexpected exceptions and
pass them through.



Constructors 



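When writing code with exceptions, it's particularly important that you always be asking,
"If an exception occurs, will this be properly cleaned up?" Most of the time you're fairly
safe, but in constructors there's a problem: If an exception is thrown before a constructor is
completed, the associated destructor will not be called for that object. This means you must
be especially diligent while writing your constructor.

The general difficulty is allocating resources in constructors. If an exception occurs in the
constructor, the destructor doesn't get a chance to deallocate the resource. This problem
occurs most often with "naked" pointers. For example: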



//: C07:Nudep.cpp
// Naked pointers
#include <fstream>
#include <cstdlib>
using namespace std;
ofstream out("nudep.out"); // Output file (name assumed)

class Cat {
public:
  Cat() { out << "Cat()" << endl; }
  ~Cat() { out << "~Cat()" << endl; }
};

class Dog {
public:
  void* operator new(size_t sz) {
    out << "allocating a Dog" << endl;
    throw int(47);
  }
  void operator delete(void* p) {
    out << "deallocating a Dog" << endl;
    ::operator delete(p);
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}; 



class UseResources {
  Cat* bp;
  Dog* op;
public:
  UseResources(int count = 1) {
    out << "UseResources()" << endl;
    bp = new Cat[count];
    op = new Dog;
  }
  ~UseResources() {
    out << "~UseResources()" << endl;
    delete [] bp; // Array delete
    delete op;
  }
};

int main() {
  try {
    UseResources ur(3);
  } catch(int) {
    out << "inside handler" << endl;
  }
} ///:~

The output is the following:

UseResources()
Cat()
Cat()
Cat()
allocating a Dog
inside handler



The UseResources constructor is entered, and the Cat constructor is successfully completed 
for the array objects. However, inside Dog::operator new, an exception is thrown (as an 
example of an out-of-memory error). Suddenly, you end up inside the handler, without the 
UseResources destructor being called. This is correct because the UseResources constructor 
was unable to finish, but it means the Cat object that was successfully created on the heap is 
never destroyed. 
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Making everything an object 



To prevent this, guard against these "raw" resource allocations by placing the allocations 
inside their own objects with their own constructors and destructors. This way, each allocation 
becomes atomic, as an object, and if it fails, the other resource allocation objects are properly 
cleaned up. Templates are an excellent way to modify the above example: 

//: C07:Wrapped.cpp
// Safe, atomic pointers
#include <fstream>
#include <cstdlib>
using namespace std;
ofstream out("wrapped.out"); // Output file (name assumed)

// Simplified. Yours may have other arguments.
template<class T, int sz = 1> class PWrap {
  T* ptr;
public:
  class RangeError {}; // Exception class
  PWrap() {
    ptr = new T[sz];
    out << "PWrap constructor" << endl;
  }
  ~PWrap() {
    delete []ptr;
    out << "PWrap destructor" << endl;
  }
  T& operator[](int i) throw(RangeError) {
    if(i >= 0 && i < sz) return ptr[i];
    throw RangeError();
  }
};

class Cat {
public:
  Cat() { out << "Cat()" << endl; }
  ~Cat() { out << "~Cat()" << endl; }
  void g() {}
};

class Dog {
public:
  void* operator new[](size_t sz) {
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throw int (47) ; 
  }
  void operator delete[](void* p) {
    ::operator delete[](p);
  }
};



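class UseResources {
  PWrap<Cat, 3> Bonk;
  PWrap<Dog> Og;
public:
  UseResources() : Bonk(), Og() {
    out << "UseResources()" << endl;
  }
  ~UseResources() {
    out << "~UseResources()" << endl;
  }
  void f() { Bonk[1].g(); }
};

int main() {
  try {
    UseResources ur;
  } catch(int) {
    out << "inside handler" << endl;
  }
} ///:~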



The difference is the use of the template to wrap the pointers and make them into objects. The 
constructors for these objects are called before the body of the UseResources constructor, and 
any of these constructors that complete before an exception is thrown will have their 
associated destructors called. 

The PWrap template shows a more typical use of exceptions than you've seen so far: A
nested class called RangeError is created to use in operator[ ] if its argument is out of range.
Because operator[ ] returns a reference it cannot return zero. (There are no null references.) 
This is a true exceptional condition - you don't know what to do in the current context, and 
you can't return an improbable value. In this example, RangeError is very simple and 
assumes all the necessary information is in the class name, but you may also want to add a 
member that contains the value of the index, if that is useful. 
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Catl) 

Cat()
Cat()
PWrap constructor
allocating a Dog
~Cat()
~Cat()
~Cat()
PWrap destructor
inside handler

Again, the storage allocation for Dog throws an exception, but this time the array of Cat
objects is properly cleaned up, so there is no memory leak. 



Exception matching 
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// : CO? :Autoexcp.cpp 
// No matching conversions
#include <iostream>
using namespace std;

class Except1 {};
class Except2 {
public:
  Except2(Except1&) {}
};

void f() { throw Except1(); }

int main() {
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y { f 1) ; 






  } catch (Except2) {
    cout << "inside catch(Except2)" << endl;
  } catch (Except1) {
    cout << "inside catch(Except1)" << endl;
  }
} ///:~



Even though you might think the first handler could be used by converting an Except1 object
into an Except2 using the constructor conversion, the system will not perform such a
conversion during exception handling, and you'll end up at the Except1 handler.

The following example shows how a base-class handler can catch a derived-class exception: 



//: C07:Basexcpt.cpp
// Exception hierarchies
#include <iostream>
using namespace std;

class X {
public:
  class Trouble {};
  class Small : public Trouble {};
  class Big : public Trouble {};
  void f() { throw Big(); }
};

int main() {
  X x;
  try {
    x.f();
  } catch(X::Trouble) {
    cout << "caught Trouble" << endl;
  // Hidden by previous handler:
  } catch(X::Small) {
    cout << "caught Small Trouble" << endl;
  } catch(X::Big) {
    cout << "caught Big Trouble" << endl;
  }
} ///:~



Here, the exception-handling mechanism will always match a Trouble object, or anything 
derived from Trouble, to the first handler. That means the second and third handlers are never 
called because the first one captures them all. It makes more sense to catch the derived types 
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first and put the base type at the end to catch anything less specific (or a derived class 
introduced later in the development cycle). 

In addition, if Small and Big represent larger objects than the base class Trouble (which is 
often true because you regularly add data members to derived classes), then those objects are 
sliced to fit into the first handler. Of course, in this example it isn't important because there 
are no additional members in the derived classes and there are no argument identifiers in the 
handlers anyway. You'll usually want to use reference arguments rather than objects in your 
handlers to avoid slicing off information. 
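
The recommended ordering can be sketched as follows. This is an illustration, not the book's listing: the classes are declared at namespace scope rather than nested in X, and the virtual name( ) member and the classify( ) function are hypothetical additions so that the effect of catching by reference is visible.

```cpp
#include <cassert>
#include <string>

class Trouble {
public:
  virtual ~Trouble() {}
  virtual std::string name() const { return "Trouble"; }
};
class Small : public Trouble {
public:
  std::string name() const { return "Small"; }
};
class Big : public Trouble {
public:
  std::string name() const { return "Big"; }
};

// Derived-class handlers come first; the base-class handler is last.
// kind: 0 throws Small, 1 throws Big, anything else throws Trouble.
std::string classify(int kind) {
  try {
    if (kind == 0) throw Small();
    if (kind == 1) throw Big();
    throw Trouble();
  } catch (const Small&) {
    return "caught Small";
  } catch (const Big&) {
    return "caught Big";
  } catch (const Trouble& t) { // By reference, so nothing is sliced
    return "caught base: " + t.name();
  }
}
```

With this order each derived type reaches its own handler, and only a plain Trouble (or a derived type added later) falls through to the last one.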



Standard exceptions 



The Standard C++ library defines the following exception classes:



exception           The base class for all the exceptions thrown by the C++
                    standard library. You can ask what( ) and get a result that
                    can be displayed as a character representation.

logic_error         Derived from exception. Reports program logic errors, which
                    could presumably be detected before the program executes.

runtime_error       Derived from exception. Reports runtime errors, which can
                    presumably be detected only when the program executes.



The iostream exception class ios::failure is also derived from exception, but it has no further
subclasses.

The classes in both of the following tables can be used as they are, or they can act as base
classes to derive your own more specific types of exceptions.



Exception classes derived from logic_error

domain_error        Reports violations of a precondition.

invalid_argument    Indicates an invalid argument to the function it's thrown from.

length_error        Indicates an attempt to produce an object whose length is
                    greater than or equal to NPOS (the largest representable
                    value of type size_t).
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Exception classes derived from logic_error 


out_of_range        Reports an out-of-range argument.

bad_cast            Thrown for executing an invalid dynamic_cast expression in
                    run-time type identification (see Chapter XX). (In the final
                    C++ Standard, bad_cast derives directly from exception.)

bad_typeid          Reports a null pointer p in an expression typeid(*p).
                    (Again, a run-time type identification feature in Chapter XX.
                    In the final C++ Standard, bad_typeid derives directly from
                    exception.)

Exception classes derived from runtime_error

range_error         Reports violation of a postcondition.

overflow_error      Reports an arithmetic overflow.

bad_alloc           Reports a failure to allocate storage. (In the final C++
                    Standard, bad_alloc derives directly from exception.)



Programming with exceptions 



When to avoid exceptions 



The following sections point out situations where exceptions are not warranted.



Not for asynchronous events 



The Standard C signal( ) system, and any similar system, handles asynchronous events:
events that happen outside the scope of the program, and thus events the program cannot
anticipate. C++ exceptions cannot be used to handle asynchronous events because the
exception and its handler are on the same call stack. That is, exceptions rely on scoping,
whereas asynchronous events must be handled by completely separate code that is not part of
the normal program flow (typically, interrupt service routines or event loops).

This is not to say that asynchronous events cannot be associated with exceptions. But the 
interrupt handler should do its job as quickly as possible and then return. Later, at some well- 
defined point in the program, an exception might be thrown based on the interrupt. 
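
One way to associate an asynchronous event with an exception is sketched below, assuming a signal handler that only sets a flag. The names interrupted, onInterrupt, Interrupted, checkInterrupt and pollOnce are hypothetical; std::signal and the sig_atomic_t type come from the standard header <csignal>.

```cpp
#include <cassert>
#include <csignal>
#include <stdexcept>

// Set by the handler. volatile sig_atomic_t is the type a signal handler
// is allowed to write.
volatile std::sig_atomic_t interrupted = 0;

extern "C" void onInterrupt(int) {
  interrupted = 1; // Do the minimum and return
}

// Hypothetical exception type for this sketch.
class Interrupted : public std::runtime_error {
public:
  Interrupted() : std::runtime_error("interrupted") {}
};

// Called at a well-defined point in the normal program flow.
void checkInterrupt() {
  if (interrupted) {
    interrupted = 0;
    throw Interrupted();
  }
}

// Returns true if an interrupt was noticed and turned into an exception.
bool pollOnce() {
  try {
    checkInterrupt();
    return false;
  } catch (const Interrupted&) {
    return true;
  }
}
```

After installing the handler with std::signal(SIGINT, onInterrupt), the exception is thrown only from checkInterrupt( ), on the ordinary call stack, never from inside the handler itself.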



Chapter 16: Exception Handling



Not for ordinary error conditions 



If you have enough information to handle an error, it's not an exception. You should take care 
of it in the current context rather than throwing an exception to a larger context. 

Also, C++ exceptions are not thrown for machine-level events like divide-by-zero. It's
assumed these are dealt with by some other mechanism, like the operating system or
hardware. That way, C++ exceptions can be reasonably efficient, and their use is isolated to
program-level exceptional conditions.

Not for flow-of-control

An exception looks somewhat like an alternate return mechanism and somewhat like a switch
statement, so you can be tempted to use them for other than their original intent. This is a bad
idea, partly because the exception-handling system is significantly less efficient than normal
program execution; exceptions are a rare event, so the normal program shouldn't pay for
them. Also, exceptions from anything other than error conditions are quite confusing to the
user of your class or function.



You're not forced to use exceptions 



In simple programs it can be acceptable to use assert( ) or to print a message and abort( ) the
program, allowing the system to clean up the mess, rather than to work very hard to catch all
exceptions and recover all the resources yourself. Basically, if you don't need to use
exceptions, you don't have to.



New exceptions, old code 



Another situation that arises is the modification of an existing program that doesn't use
exceptions. You may introduce a library that does use exceptions and wonder if you need to
modify all your code throughout the program. Assuming you have an acceptable error-handling
scheme already in place, the most sensible thing to do here is surround the largest
block that uses the new library (this may be all the code in main( )) with a try block, followed
by a catch(...) and basic error message. You can refine this to whatever degree necessary by
adding more specific handlers, but, in any case, the code you're forced to add can be minimal.

You can also isolate your exception-generating code in a try block and write handlers to
convert the exceptions into your existing error-handling scheme.

It's truly important to think about exceptions when you're creating a library for someone else
to use, and you can't know how they need to respond to critical error conditions.
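
Converting exceptions into an existing error-handling scheme might look like the sketch below. Everything here is hypothetical: Status stands in for whatever error codes the old program already uses, and libraryLength( ) stands in for a call into the new exception-throwing library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical existing scheme: functions report status with a code.
enum Status { STATUS_OK = 0, STATUS_BAD_INPUT = 1, STATUS_UNKNOWN = 2 };

// Stand-in for a call into the new, exception-throwing library.
int libraryLength(const std::string& s) {
  if (s.empty()) throw std::invalid_argument("empty string");
  if (s == "?") throw 42; // Some exception type the caller didn't expect
  return static_cast<int>(s.size());
}

// Adapter: the try block isolates the exception-generating code, and the
// handlers convert each exception into the existing status-code scheme.
Status legacyLength(const std::string& s, int& result) {
  try {
    result = libraryLength(s);
    return STATUS_OK;
  } catch (const std::invalid_argument&) {
    return STATUS_BAD_INPUT;
  } catch (...) {
    return STATUS_UNKNOWN; // Basic catch-all for everything else
  }
}
```

The rest of the old program calls legacyLength( ) and checks the returned code as it always has, so no exception crosses into code that wasn't written for it.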



Typical uses of exceptions 
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4. Fix the problem and call the function (which caused the exception) again. 

5. Patch things up and continue without retrying the function.

6. Calculate some alternative result instead of what the function was supposed
to produce.

7. Do whatever you can in the current context and rethrow the same exception
to a higher context.

8. Do whatever you can in the current context and throw a different exception
to a higher context.

9. Terminate the program.

10. Wrap functions (especially C library functions) that use ordinary error
schemes so they produce exceptions instead.

11. Simplify. If your exception scheme makes things more complicated, then it
is painful and annoying to use.

12. Make your library and program safer. This is a short-term investment (for
debugging) and a long-term investment (for application robustness).

Always use exception specifications 

The exception specification is like a function prototype: It tells the user to write exception-handling
code and what exceptions to handle. It tells the compiler the exceptions that may
come out of this function.

Of course, you can't always anticipate by looking at the code what exceptions will arise from
a particular function. Sometimes the functions it calls produce an unexpected exception, and
sometimes an old function that didn't throw an exception is replaced with a new one that
does, and you'll get a call to unexpected( ). Anytime you use exception specifications or call
functions that do, you should create your own unexpected( ) function that logs a message and
rethrows the same exception.

Start with standard exceptions

Check out the Standard C++ library exceptions before creating your own. If a standard
exception does what you need, chances are it's a lot easier for your user to understand and
handle.

If the exception type you want isn't part of the standard library, try to derive one from an
existing standard exception. It's nice for your users if they can always write their code to
expect the what( ) function defined in the exception class interface.






Nest your own exceptions 



If you create exceptions for your particular class, it's a very good idea to nest the exception 
classes inside your class to provide a clear message to the reader that this exception is used 
only for your class. In addition, it prevents the pollution of the namespace. 

You can nest your exceptions even if you're deriving them from C++ standard exceptions. 
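
The two guidelines combine naturally, as in this sketch. IntStack, Underflow and describePop are hypothetical names; std::out_of_range is the standard class from <stdexcept>.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container used only for illustration.
class IntStack {
  std::vector<int> data;
public:
  // Nested so readers see it belongs to IntStack; derived from a
  // standard exception so callers can rely on what().
  class Underflow : public std::out_of_range {
  public:
    Underflow()
      : std::out_of_range("IntStack::Underflow: stack is empty") {}
  };
  void push(int v) { data.push_back(v); }
  int pop() {
    if (data.empty()) throw Underflow();
    int v = data.back();
    data.pop_back();
    return v;
  }
};

// Code that knows nothing about IntStack::Underflow can still report
// the error through the std::exception interface.
std::string describePop(IntStack& s) {
  try {
    s.pop();
    return "ok";
  } catch (const std::exception& e) {
    return e.what();
  }
}
```

Callers that care about the specific condition can catch IntStack::Underflow by reference; everyone else can catch std::out_of_range or std::exception.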

Use exception hierarchies

Exception hierarchies provide a valuable way to classify the different types of critical errors
that may be encountered with your class or library. This gives helpful information to users,
assists them in organizing their code, and gives them the option of ignoring all the specific
types of exceptions and just catching the base-class type. Also, any exceptions added later by
inheriting from the same base class will not force all existing code to be rewritten - the
base-class handler will catch the new exception.

Of course, the Standard C++ exceptions are a good example of an exception hierarchy, and
one that you can use to build upon.



Multiple inheritance 



You'll remember from Chapter XX that the only essential place for MI is if you need to
upcast a pointer to your object into two different base classes - that is, if you need
polymorphic behavior with both of those base classes. It turns out that exception hierarchies
are a useful place for multiple inheritance because a base-class handler from any of the roots
of the multiply inherited exception class can handle the exception.
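
A small sketch of the idea, using hypothetical class names (IoFailure, Timeout, RemoteReadTimeout): an exception class with two public bases can be caught by a handler for either base.

```cpp
#include <cassert>

// Two unrelated root classes for exceptions (hypothetical names).
class IoFailure {
public:
  virtual ~IoFailure() {}
};
class Timeout {
public:
  virtual ~Timeout() {}
};

// A multiply inherited exception: it is both an IoFailure and a Timeout.
class RemoteReadTimeout : public IoFailure, public Timeout {};

void failingRead() { throw RemoteReadTimeout(); }

// A handler for either root matches the same thrown object.
bool caughtAsIoFailure() {
  try { failingRead(); } catch (const IoFailure&) { return true; }
  return false;
}
bool caughtAsTimeout() {
  try { failingRead(); } catch (const Timeout&) { return true; }
  return false;
}
```

Code that only understands I/O failures and code that only understands timeouts can each handle the exception without knowing about the other hierarchy.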



Catch by reference, not by value 



If you throw an object of a derived class and it is caught by value in a handler for an object of 
the base class, that object is "sliced" - that is, the derived-class elements are cut off and you'll 
end up with the base-class object being passed. Chances are this is not what you want because 
the object will behave like a base-class object and not the derived class object it really is (or 
rather, was - before it was sliced). Here's an example: 



//: C07:Catchref.cpp
// Why catch by reference?
#include <iostream>
using namespace std;

class Base {
public:
  virtual void what() {
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class Derived : public : 
public:
  void what() {
    cout << "Derived" << endl;
  }
};

void f() { throw Derived(); }

int main() {
  try {
    f();
  } catch(Base b) {
    b.what();
  }
  try {
    f();
  } catch(Base& b) {
    b.what();
  }
} ///:~

The output is

Base
Derived


because, when the object is caught by value, it is turned into a Base object (by the copy- 
constructor) and must behave that way in all situations, whereas when it's caught by 
reference, only the address is passed and the object isn't truncated, so it behaves like what it 
really is, a Derived in this case. 

Although you can also throw and catch pointers, by doing so you introduce more coupling -
the thrower and the catcher must agree on how the exception object is allocated and cleaned
up. This is a problem because the exception itself may have occurred from heap exhaustion. If
you throw exception objects, the exception-handling system takes care of all storage.



Throw exceptions in constructors 



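Because a constructor has no return value, you've previously had two choices to report an
error during construction: set a nonlocal flag and hope the user checks it, or return
an incompletely created object and hope the user checks it.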
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This is a serious problem because C programmers have come to rely on an implied guarantee 
that object creation is always successful, which is not unreasonable in C where types are so 
primitive. But continuing execution after construction fails in a C++ program is a guaranteed 
disaster, so constructors are one of the most important places to throw exceptions - now you 
have a safe, effective way to handle constructor errors. However, you must also pay attention 
to pointers inside objects and the way cleanup occurs when an exception is thrown inside a
constructor.

Don't cause exceptions in destructors 

Because destructors are called in the process of throwing other exceptions, you'll never want
to throw an exception in a destructor or cause another exception to be thrown by some action
you perform in the destructor. If this happens, it means that a new exception may be thrown
before the catch-clause for an existing exception is reached, which will cause a call to
terminate( ).

This means that if you call any functions inside a destructor that may throw exceptions, those 
calls should be within a try block in the destructor, and the destructor must handle all 
exceptions itself. None must escape from the destructor. 
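
A destructor that follows this rule can be sketched as below. Session, flushBuffers, failuresSwallowed and unwindSafely are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <stdexcept>

int failuresSwallowed = 0; // Counts cleanup failures (illustration only)

// Hypothetical cleanup routine that can fail.
void flushBuffers(bool fail) {
  if (fail) throw std::runtime_error("flush failed");
}

class Session {
  bool failOnClose;
public:
  explicit Session(bool f) : failOnClose(f) {}
  ~Session() {
    try {
      flushBuffers(failOnClose);
    } catch (...) {
      ++failuresSwallowed; // Record it; never let it escape
    }
  }
};

// Throws while a Session is on the stack, so ~Session runs during
// stack unwinding. Returns true if the handler was reached.
bool unwindSafely() {
  try {
    Session s(true);
    throw std::logic_error("original problem");
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

Because the failure inside ~Session is handled there, the original exception still reaches its handler and terminate( ) is not called.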



Avoid naked pointers 



See Wrapped.cpp. A naked pointer usually means vulnerability in the constructor if resources
are allocated for that pointer. A pointer doesn't have a destructor, so those resources won't
be released if an exception is thrown in the constructor.



Overhead 



Of course it costs something for this new feature; when an exception is thrown there's
considerable runtime overhead. This is the reason you never want to use exceptions as part of
your normal flow-of-control, no matter how tempting and clever it may seem. Exceptions
should occur only rarely, so the overhead is piled on the exception and not on the normally
executing code. One of the important design goals for exception handling was that it could be
implemented with no impact on execution speed when it wasn't used; that is, as long as you
don't throw an exception, your code runs as fast as it would without exception handling.
Whether or not this is actually true depends on the particular compiler implementation you're
using.
Exception handling also causes extra information to be put on the stack by the compiler, to aid 
in stack unwinding. 

Exception objects are properly passed around like any other objects, except that they can be 
passed into and out of what can be thought of as a special "exception scope" (which may just 
be the global scope). That's how they go from one place to another. When the exception 
handler is finished, the exception objects are properly destroyed. 
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Summary 



Error recovery is a fundamental concern for every program you write, and it's especially 
important in C++, where one of the goals is to create program components for others to use. 
To create a robust system, each component must be robust. 

The goals for exception handling in C++ are to simplify the creation of large, reliable 
programs using less code than currently possible, with more confidence that your application 
doesn't have an unhandled error. This is accomplished with little or no performance penalty, 
and with low impact on existing code. 

Basic exceptions are not terribly difficult to learn, and you should begin using them in your 
programs as soon as you can. Exceptions are one of those features that provide immediate and 



significant benefits to your project. 

Exercises 



1. Create a class with member functions that throw exceptions. Within this 
class, make a nested class to use as an exception object. It takes a single 
char* as its argument; this represents a description string. Create a member 
function that throws this exception. (State this in the function's exception 
specification.) Write a try block that calls this function and a catch clause 
that handles the exception by printing out its description string. 

2. Rewrite the Stash class from Chapter XX so it throws out-of-range 
exceptions for operator[]. 

3. Write a generic main( ) that takes all exceptions and reports them as errors.

4. Create a class with its own operator new. This operator should allocate 10
objects, and on the 11th "run out of memory" and throw an exception. Also
add a static member function that reclaims this memory. Now create a
main( ) with a try block and a catch clause that calls the memory-restoration
routine. Put these inside a while loop, to demonstrate recovering
from an exception and continuing execution.

5. Create a destructor that throws an exception, and write code to prove to
yourself that this is a bad idea by showing that if a new exception is thrown
before the handler for the existing one is reached, terminate( ) is called.

6. Prove to yourself that all exception objects (the ones that are thrown) are
properly destroyed.

7. Prove to yourself that if you create an exception object on the heap and
throw the pointer to that object, it will not be cleaned up.

8. (Advanced). Track the creation and passing of an exception using a class
with a constructor and copy-constructor that announce themselves and
provide as much information as possible about how the object is being
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created (and in the case of the copy -constructor, what object it's being 
created from). Set up an interesting situation, throw an object of your new
type, and analyze the result. 
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8 : Run-time type 
identification 



Run-time type identification (RTTI) lets you find the exact 
type of an object when you have only a pointer or reference 
to the base type. 

This can be thought of as a "secondary" feature in C++, a pragmatism to help out when you 
get into messy situations. Normally, you'll want to intentionally ignore the exact type of an 
object and let the virtual function mechanism implement the correct behavior for that type. 
But occasionally it's useful to know the exact type of an object for which you only have a 
base pointer. Often this information allows you to perform a special-case operation more 
efficiently or prevent a base-class interface from becoming ungainly. It happens enough that
most class libraries contain virtual functions to produce run-time type information. When 
exception handling was added to C++, it required the exact type information about objects. It 
became an easy next step to build access to that information into the language. 

This chapter explains what RTTI is for and how to use it. In addition, it explains the why and 
how of the new C++ cast syntax, which has the same appearance as RTTI. 

The "Shape" example 

The generic type is the base class Shape, and the specific derived types are Circle, Square, and Triangle:




This is a typical class-hierarchy diagram, with the base class at the top and the derived classes
growing downward. The normal goal in object-oriented programming is for the bulk of your
code to manipulate pointers to the base type (Shape, in this case) so if you decide to extend
the program by adding a new class (Rhomboid, derived from Shape, for example), the bulk of
the code is not affected. In this example, the virtual function in the Shape interface is draw( ), 
so the intent is for the client programmer to call draw( ) through a generic Shape pointer. 
draw( ) is redefined in all the derived classes, and because it is a virtual function, the proper 
behavior will occur even though it is called through a generic Shape pointer. 

Thus, you generally create a specific object (Circle, Square, or Triangle), take its address 
and cast it to a Shape* (forgetting the specific type of the object), and use that anonymous 
pointer in the rest of the program. Historically, diagrams are drawn as seen above, so the act 
of casting from a more derived type to a base type is called upcasting. 



What is RTTI? 



But what if you have a special programming problem that's easiest to solve if you know the
exact type of a generic pointer? For example, suppose you want to allow your users to
highlight all the shapes of any particular type by turning them purple. This way, they can find
all the triangles on the screen by highlighting them. Your natural first approach may be to try
a virtual function like TurnColorIfYouAreA( ), which allows enumerated arguments of
some type color and of Shape::Circle, Shape::Square, or Shape::Triangle.

To solve this sort of problem, most class library designers put virtual functions in the base 
class to return type information about the specific object at runtime. You may have seen 
library member functions with names like isA( ) and typeOf( ). These are vendor-defined 
RTTI functions. Using these functions, as you go through the list you can say, "If you're a
triangle, turn purple."

When exception handling was added to C++, the implementation required that some run-time 
type information be put into the virtual function tables. This meant that with a small language 
extension the programmer could also get the run-time type information about an object. All 
library vendors were adding their own RTTI anyway, so it was included in the language. 

RTTI, like exceptions, depends on type information residing in the virtual function table. If 
you try to use RTTI on a class that has no virtual functions, you'll get unexpected results. 



Two syntaxes for RTTI 



There are two different ways to use RTTI. The first acts like sizeof( ) because it looks like a
function, but it's actually implemented by the compiler. typeid( ) takes an argument that's an
object, a reference, or a pointer and returns a reference to a global const object of type
typeinfo (spelled type_info in Standard C++). These can be compared to each other with the
operator== and operator!=, and you can also ask for the name( ) of the type, which returns a
string representation of the type name. Note that if you hand typeid( ) a Shape*, it will say
that the type is Shape*, so if you want to know the exact type it is pointing to, you must
dereference the pointer. For example, if s is a Shape*,

  cout << typeid(*s).name() << endl;

will print out the type of the object s points to. 
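The difference between handing typeid( ) a pointer and handing it the dereferenced pointer can be checked directly. This is only a small sketch; the Shape and Circle classes here are stand-ins, and the two helper functions are names I've made up for the demonstration:

```cpp
#include <typeinfo>

class Shape {
public:
  virtual ~Shape() {}
};
class Circle : public Shape {};

// typeid applied to the pointer itself reports the pointer's
// declared type, Shape*, no matter what it points to.
bool pointerTypeIsShapePtr(Shape* s) {
  return typeid(s) == typeid(Shape*);
}

// typeid applied to the dereferenced pointer reports the exact
// type of the object, because Shape is polymorphic.
bool pointsToCircle(Shape* s) {
  return typeid(*s) == typeid(Circle);
}
```

Both functions compare typeinfo objects with operator== rather than comparing name( ) strings, since the spelling of the names is left to the implementation.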

You can also ask a typeinfo object if it precedes another typeinfo object in the
implementation-defined "collation sequence," using before(typeinfo&), which returns true or
false. When you say,

  if(typeid(me).before(typeid(you))) // ...

you're asking if me occurs before you in the collation sequence.

The second syntax for RTTI is called a "type-safe downcast." The reason for the term
"downcast" is (again) the historical arrangement of the class hierarchy diagram. If casting a
Circle* to a Shape* is an upcast, then casting a Shape* to a Circle* is a downcast. However,
you know a Circle* is also a Shape*, and the compiler freely allows an upcast assignment, but
you don't know that a Shape* is necessarily a Circle*, so the compiler doesn't allow you to
perform a downcast assignment without using an explicit cast. You can of course force your
way through using ordinary C-style casts or a C++ static_cast (described at the end of this
chapter), which says, "I hope this is actually a Circle*, and I'm going to pretend it is."
Without some explicit knowledge that it is in fact a Circle, this is a totally dangerous thing to
do. A common approach in vendor-defined RTTI is to create some function that attempts to
assign (for this example) a Shape* to a Circle*, checking the type in the process. If this
function returns the address, it was successful; if it returns null, you didn't have a Circle*.

The C++ RTTI type-safe downcast follows this "attempt-to-cast" function form, but it uses
(very logically) the template syntax to produce the special function dynamic_cast. So the
example becomes

  Shape* sp = new Circle;
  Circle* cp = dynamic_cast<Circle*>(sp);
  if(cp) cout << "cast successful";

The template argument for dynamic_cast is the type you want the function to produce, and
this is the return value for the function. The function argument is what you are trying to
cast from.
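As a small illustration of the attempt-to-cast form, here is a sketch that uses the result of dynamic_cast only after testing it for null. The radius member and the radiusOrMinusOne( ) function are assumptions made up for this example:

```cpp
class Shape {
public:
  virtual ~Shape() {}
};

class Circle : public Shape {
public:
  int radius;
  Circle() : radius(1) {}
};

class Square : public Shape {};

// Returns the radius if sh really points to a Circle,
// or -1 if the downcast fails.
int radiusOrMinusOne(Shape* sh) {
  Circle* cp = dynamic_cast<Circle*>(sh);
  if(cp) return cp->radius; // Cast succeeded
  return -1;                // Null result: not a Circle
}
```

The important habit is the test of cp before it is used; a dynamic_cast to a pointer type reports failure with a null pointer, not with an exception.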

Normally you might be hunting for one type (triangles to turn purple, for instance), but the
following example fragment can be used if you want to count the number of various shapes.

  Circle* cp = dynamic_cast<Circle*>(sh);
  Square* sp = dynamic_cast<Square*>(sh);
  Triangle* tp = dynamic_cast<Triangle*>(sh);

Of course this is contrived: you'd probably put a static data member in each type and
increment it in the constructor. You would do something like that if you had control of the
source code for the class and could change it. Here's an example that counts shapes using
both the static member approach and dynamic_cast:

//: C08:Rtshapes.cpp
// Counting shapes
#include "../purge.h"
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <typeinfo>
#include <vector>
using namespace std;

class Shape {
protected:
  static int count;
public:
  Shape() { count++; }
  virtual ~Shape() { count--; }
  virtual void draw() const = 0;
  static int quantity() { return count; }
};

int Shape::count = 0;

class SRectangle : public Shape {
  void operator=(SRectangle&); // Disallow
protected:
  static int count;
public:
  SRectangle() { count++; }
  SRectangle(const SRectangle&) { count++; }
  ~SRectangle() { count--; }
  void draw() const {
    cout << "SRectangle::draw()" << endl;
  }
  static int quantity() { return count; }
};

int SRectangle::count = 0;

class SEllipse : public Shape {
  void operator=(SEllipse&); // Disallow
protected:
  static int count;
public:
  SEllipse() { count++; }
  SEllipse(const SEllipse&) { count++; }
  ~SEllipse() { count--; }
  void draw() const {
    cout << "SEllipse::draw()" << endl;
  }
  static int quantity() { return count; }
};

int SEllipse::count = 0;

class SCircle : public SEllipse {
  void operator=(SCircle&); // Disallow
protected:
  static int count;
public:
  SCircle() { count++; }
  SCircle(const SCircle&) { count++; }
  ~SCircle() { count--; }
  void draw() const {
    cout << "SCircle::draw()" << endl;
  }
  static int quantity() { return count; }
};

int SCircle::count = 0;

int main() {
  vector<Shape*> shapes;
  srand(time(0)); // Seed random number generator
  const int mod = 12;
  // Create a random quantity of each type:
  for(int i = 0; i < rand() % mod; i++)
    shapes.push_back(new SRectangle);
  for(int j = 0; j < rand() % mod; j++)
    shapes.push_back(new SEllipse);
  for(int k = 0; k < rand() % mod; k++)
    shapes.push_back(new SCircle);
  int nCircles = 0;
  int nEllipses = 0;
  int nRects = 0;
  int nShapes = 0;
  for(size_t u = 0; u < shapes.size(); u++) {
    shapes[u]->draw();
    if(dynamic_cast<SCircle*>(shapes[u]))
      nCircles++;
    if(dynamic_cast<SEllipse*>(shapes[u]))
      nEllipses++;
    if(dynamic_cast<SRectangle*>(shapes[u]))
      nRects++;
    if(dynamic_cast<Shape*>(shapes[u]))
      nShapes++;
  }
  cout << endl << endl
    << "Circles = " << nCircles << endl
    << "Ellipses = " << nEllipses << endl
    << "Rectangles = " << nRects << endl
    << "Shapes = " << nShapes << endl
    << endl
    << "SCircle::quantity() = "
    << SCircle::quantity() << endl
    << "SEllipse::quantity() = "
    << SEllipse::quantity() << endl
    << "SRectangle::quantity() = "
    << SRectangle::quantity() << endl
    << "Shape::quantity() = "
    << Shape::quantity() << endl;
  purge(shapes);
} ///:~

Both types work for this example, but the static member approach can be used only if you 
own the code and have installed the static members and functions (or if a vendor provides 
them for you). In addition, the syntax for RTTI may then be different from one class to 
another. 



Syntax specifics 



typeid( ) with built-in types 

For consistency, the typeid( ) operator works with built-in types. So the following
expressions are true:

//: C08:TypeidAndBuiltins.cpp
#include <cassert>
#include <typeinfo>
using namespace std;

int main() {
  assert(typeid(47) == typeid(int));
  assert(typeid(0) == typeid(int));
  int i;
  assert(typeid(i) == typeid(int));
  assert(typeid(&i) == typeid(int*));
} ///:~

Producing the proper type name 

typeid( ) must work properly in all situations. For example, the following class contains a
nested class:

//: C08:RTTIandNesting.cpp
#include <iostream>
#include <typeinfo>
using namespace std;

class One {
  class Nested {};
  Nested* n;
public:
  One() : n(new Nested) {}
  ~One() { delete n; }
  Nested* nested() { return n; }
};

int main() {
  One o;
  cout << typeid(*o.nested()).name() << endl;
} ///:~

The typeinfo::name( ) member function will still produce the proper class name; the result is
One::Nested (the exact spelling of the string is implementation-defined).



Nonpolymorphic types 



Although typeid( ) works with nonpolymorphic types (those that don't have a virtual function
in the base class), the information you get this way is dubious. For the following class
hierarchy,

//: C08:RTTIWithoutPolymorphism.cpp
#include <cassert>
#include <typeinfo>
using namespace std;

class X {
  // ...
};

class Y : public X {
public:
  // ...
};

int main() {
  X* xp = new Y;
  assert(typeid(*xp) == typeid(X));
  assert(typeid(*xp) != typeid(Y));
} ///:~

you create an object of the derived type and upcast it.

The typeid( ) operator will produce results, but not the ones you might expect. Because
there's no polymorphism, the static type information is used:

  typeid(*xp) == typeid(X)
  typeid(*xp) != typeid(Y)

RTTI is intended for use only with polymorphic classes. 
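To see how much difference a single virtual function makes, the following sketch puts the two cases side by side. All four class names and both helper functions are invented for this comparison:

```cpp
#include <typeinfo>

// No virtual functions: typeid( ) uses the static type.
struct PlainBase {};
struct PlainDerived : PlainBase {};

// One virtual function: typeid( ) uses the dynamic type.
struct PolyBase { virtual ~PolyBase() {} };
struct PolyDerived : PolyBase {};

// Does typeid( ) see the derived object through a base pointer?
bool plainSeesDerived() {
  PlainDerived d;
  PlainBase* p = &d;
  return typeid(*p) == typeid(PlainDerived);
}

bool polySeesDerived() {
  PolyDerived d;
  PolyBase* p = &d;
  return typeid(*p) == typeid(PolyDerived);
}
```

plainSeesDerived( ) returns false and polySeesDerived( ) returns true, so adding a virtual destructor to a base class is often the simplest way to make a hierarchy usable with RTTI.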

Casting to intermediate levels 

dynamic_cast can detect both exact types and, in an inheritance hierarchy with multiple
levels, intermediate types. For example,

//: C08:DynamicCast.cpp
// Using the standard dynamic_cast operation
#include <cassert>
#include <typeinfo>
using namespace std;

class D1 {
public:
  virtual void func() {}
  virtual ~D1() {}
};

class D2 {
public:
  virtual void bar() {}
};

class MI : public D1, public D2 {};
class Mi2 : public MI {};

int main() {
  D2* d2 = new Mi2;
  Mi2* mi2 = dynamic_cast<Mi2*>(d2);
  MI* mi = dynamic_cast<MI*>(d2);
  D1* d1 = dynamic_cast<D1*>(d2);
  assert(typeid(d2) != typeid(Mi2*));
  assert(typeid(d2) == typeid(D2*));
} ///:~



This has the extra complication of multiple inheritance. If you create an Mi2 and upcast it to
the root (in this case, one of the two possible roots is chosen), then the dynamic_cast back to
either of the derived levels MI or Mi2 is successful.

You can even cast from one root to the other:

  D1* d1 = dynamic_cast<D1*>(d2);

This is successful because d2 is actually pointing to an Mi2 object, which contains a
subobject of type D1.

Casting to intermediate levels brings up an interesting difference between dynamic_cast and
typeid( ). typeid( ) always produces a reference to a typeinfo object that describes the exact
type of the object. Thus it doesn't give you intermediate-level information. In the following
expression (which is true), typeid( ) doesn't see d2 as a pointer to the derived type, like
dynamic_cast does:

  typeid(d2) != typeid(Mi2*)

The type of d2 is simply the exact type of the pointer:

  typeid(d2) == typeid(D2*)

void pointers

Run-time type identification doesn't work with void pointers:



//: C08:Voidrtti.cpp
// RTTI & void pointers
#include <iostream>
#include <typeinfo>
using namespace std;

class Stimpy {
public:
  virtual void happy() {}
  virtual void joy() {}
  virtual ~Stimpy() {}
};

int main() {
  void* v = new Stimpy;
  // Error:
  //!  Stimpy* s = dynamic_cast<Stimpy*>(v);
  // Error:
  //!  cout << typeid(*v).name() << endl;
} ///:~
A void* truly means "no type information at all." 

Using RTTI with templates 

Templates generate many different class names, and sometimes you'd like to print out
information about what class you're in. RTTI provides a convenient way to do this. The
following example revisits the code in Chapter XX to print out the order of constructor and
destructor calls without using a preprocessor macro:

//: C08:ConstructorOrder.cpp
// Order of constructor calls
#include <iostream>
#include <typeinfo>
using namespace std;

template<int id> class Announce {
public:
  Announce() {
    cout << typeid(*this).name()
         << " constructor " << endl;
  }
  ~Announce() {
    cout << typeid(*this).name()
         << " destructor " << endl;
  }
};

class X : public Announce<0> {
  Announce<1> m1;
  Announce<2> m2;
public:
  X() { cout << "X::X()" << endl; }
  ~X() { cout << "X::~X()" << endl; }
};

int main() { X x; } ///:~

The <typeinfo> header must be included to call any member functions for the typeinfo object 
returned by typeid( ). The template uses a constant int to differentiate one class from another, 
but class arguments will work as well. Inside both the constructor and destructor, RTTI 
information is used to produce the name of the class to print. The class X uses both 
inheritance and composition to create a class that has an interesting order of constructor and 
destructor calls. 

This technique is often useful in situations when you're trying to understand how the 
language works. 
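The remark that class arguments work as well as a constant int can be tried with a sketch like the following. Tagged and its type( ) member are hypothetical names, and the example compares typeinfo objects instead of printing names because the text returned by name( ) varies from one compiler to another:

```cpp
#include <typeinfo>

// Each instantiation of Tagged is a distinct class, so each
// has its own typeinfo object.
template<class T> class Tagged {
public:
  const std::type_info& type() const { return typeid(*this); }
};

bool sameInstantiationMatches() {
  Tagged<int> a, b;
  return a.type() == b.type();
}

bool differentInstantiationsDiffer() {
  Tagged<int> a;
  Tagged<double> b;
  return a.type() != b.type();
}
```

Both functions return true. If you do want readable output, print type( ).name( ) and expect the spelling to depend on your compiler.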



References 



RTTI must adjust somewhat to work with references. The contrast between pointers and
references occurs because a reference is always dereferenced for you by the compiler,
whereas a pointer's type or the type it points to may be examined. Here's an example:

//: C08:RTTIwithReferences.cpp
#include <cassert>
#include <typeinfo>
using namespace std;

class B {
public:
  virtual float f() { return 1.0; }
  virtual ~B() {}
};

class D : public B { /* ... */ };

int main() {
  B* p = new D;
  B& r = *p;
  assert(typeid(p) == typeid(B*));
  assert(typeid(p) != typeid(D*));
  assert(typeid(r) == typeid(D));
  assert(typeid(*p) == typeid(D));
  assert(typeid(*p) != typeid(B));
  assert(typeid(&r) == typeid(B*));
  assert(typeid(&r) != typeid(D*));
  assert(typeid(r.f()) == typeid(float));
} ///:~

Whereas the type of pointer that typeid( ) sees is the base type and not the derived type, the
type it sees for the reference is the derived type:

  typeid(p) == typeid(B*)
  typeid(p) != typeid(D*)
  typeid(r) == typeid(D)

Conversely, what the pointer points to is the derived type and not the base type, and taking the
address of the reference produces the base type and not the derived type:

  typeid(*p) == typeid(D)
  typeid(*p) != typeid(B)
  typeid(&r) == typeid(B*)
  typeid(&r) != typeid(D*)

Expressions may also be used with the typeid( ) operator because they have a type as well:

  typeid(r.f()) == typeid(float)



Exceptions 



When you perform a dynamic_cast to a reference, the result must be assigned to a reference.
But what happens if the cast fails? There are no null references, so this is the perfect place to
throw an exception; the Standard C++ exception type is bad_cast, but in the following
example the ellipses are used to catch any exception:

//: C08:RTTIwithExceptions.cpp
#include <typeinfo>
#include <iostream>
using namespace std;

class X { public: virtual ~X() {} };
class B { public: virtual ~B() {} };
class D : public B {};

int main() {
  D d;
  B& b = d; // Upcast to reference
  try {
    X& xr = dynamic_cast<X&>(b);
  } catch(...) {
    cout << "dynamic_cast<X&>(b) failed"
         << endl;
  }
  X* xp = 0;
  try {
    typeid(*xp); // Throws exception
  } catch(bad_typeid&) {
    cout << "Bad typeid() expression" << endl;
  }
} ///:~

The failure, of course, is because b doesn't actually point to an X object. If an exception was 
not thrown here, then xr would be unbound, and the guarantee that all objects or references 
are constructed storage would be broken. 

An exception is also thrown if you try to dereference a null pointer in the process of calling 
typeid( ). The Standard C++ exception is called bad_typeid. 

Here (unlike the reference example above) you can avoid the exception by checking for a
nonzero pointer value before attempting the operation; this is the preferred practice.
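Both failure modes can be handled explicitly, as this sketch suggests. It catches std::bad_cast by name instead of using the ellipses, and it checks the pointer before calling typeid( ). The function names refersToX( ) and safeTypeName( ) are made up for the example:

```cpp
#include <typeinfo>
#include <string>

class B { public: virtual ~B() {} };
class D : public B {};
class X { public: virtual ~X() {} };

// A reference cast cannot produce null, so failure is
// reported by the bad_cast exception.
bool refersToX(B& b) {
  try {
    X& xr = dynamic_cast<X&>(b);
    (void)xr; // Only the success of the cast matters here
    return true;
  } catch(const std::bad_cast&) {
    return false;
  }
}

// Pointer case: test for null first, so bad_typeid is
// never thrown.
std::string safeTypeName(B* p) {
  if(p == 0) return "null pointer";
  return typeid(*p).name();
}
```

Catching the specific exception type keeps unrelated exceptions from being swallowed, which the catch(...) form would do silently.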



Multiple inheritance 

Of course, the RTTI mechanisms must work properly with all the complexities of multiple
inheritance, including virtual base classes:

//: C08:RTTIandMultipleInheritance.cpp
#include <iostream>
#include <typeinfo>
using namespace std;

class BB {
public:
  virtual void f() {}
  virtual ~BB() {}
};
class B1 : virtual public BB {};
class B2 : virtual public BB {};
class MI : public B1, public B2 {};

int main() {
  BB* bbp = new MI; // Upcast
  // Proper name detection:
  cout << typeid(*bbp).name() << endl;
  // dynamic_cast works properly:
  MI* mip = dynamic_cast<MI*>(bbp);
  // Can't force old-style cast:
  //!  MI* mip2 = (MI*)bbp; // Compile error
} ///:~

typeid( ) properly detects the name of the actual object, even through the virtual base class
pointer. The dynamic_cast also works correctly. But the compiler won't even allow you to
try to force a cast the old way:

  MI* mip = (MI*)bbp; // Compile-time error

It knows this is never the right thing to do, so it requires that you use a dynamic_cast.

Sensible uses for RTTI 



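Because it allows you to discover type information from an anonymous polymorphic pointer,
RTTI is ripe for misuse by the novice because RTTI may make sense before virtual functions
do. For many people coming from a procedural background, it's very difficult not to organize
their programs into sets of switch statements. They could accomplish this with RTTI and thus
lose the very important value of polymorphism in code development and maintenance. The
intent of C++ is that you use virtual functions throughout your code, and you only use RTTI
when you must.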

However, using virtual functions as they are intended requires that you have control of the 
base-class definition because at some point in the extension of your program you may 
discover the base class doesn't include the virtual function you need. If the base class comes 
from a library or is otherwise controlled by someone else, a solution to the problem is RTTI: 
You can inherit a new type and add your extra member function. Elsewhere in the code you 
can detect your particular type and call that member function. This doesn't destroy the 
polymorphism and extensibility of the program, because adding a new type will not require 
you to hunt for switch statements. However, when you add new code in your main body that 
requires your new feature, you'll have to detect your particular type. 
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Putting a feature in a base class might mean that, for the benefit of one particular class, all the 
other classes derived from that base require some meaningless stub of a virtual function. This 
makes the interface less clear and annoys those who must redefine pure virtual functions 
when they derive from that base class. For example, suppose that in the Wind5.cpp program
in Chapter XX you wanted to clear the spit valves of all the instruments in your orchestra that
had them. One option is to put a virtual ClearSpitValve( ) function in the base class
Instrument, but this is confusing because it implies that Percussion and electronic
instruments also have spit valves. RTTI provides a much more reasonable solution in this case 
because you can place the function in the specific class (Wind in this case) where it's 
appropriate. 
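A sketch of that arrangement follows. The classes are simplified stand-ins for the orchestra hierarchy, not the ones from the earlier chapter, and the valveClear_ flag and clearIfWind( ) function exist only so the effect can be observed:

```cpp
class Instrument {
public:
  virtual ~Instrument() {}
  virtual void play() const {}
};

class Wind : public Instrument {
  bool valveClear_;
public:
  Wind() : valveClear_(false) {}
  // Lives only in the class where it makes sense:
  void clearSpitValve() { valveClear_ = true; }
  bool valveClear() const { return valveClear_; }
};

class Percussion : public Instrument {};

// Clears the valve only when the instrument really is a Wind;
// returns whether anything was done.
bool clearIfWind(Instrument* ip) {
  Wind* wp = dynamic_cast<Wind*>(ip);
  if(!wp) return false;
  wp->clearSpitValve();
  return true;
}
```

The Instrument interface stays free of a function that most instruments can't implement meaningfully, and only the code that cares about spit valves needs to know that Wind exists.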

Finally, RTTI will sometimes solve efficiency problems. If your code uses polymorphism in a 
nice way, but it turns out that one of your objects reacts to this general-purpose code in a 
horribly inefficient way, you can pick that type out using RTTI and write case-specific code 
to improve the efficiency. 

Revisiting the trash recycler 

Here's the trash recycling simulation from Chapter XX, rewritten to use RTTI instead of
building the information into the class hierarchy:

//: C08:Recycle2.cpp
// Chapter XX example w/ RTTI
#include "../purge.h"
#include <fstream>
#include <vector>
#include <typeinfo>
#include <cstdlib>
#include <ctime>
using namespace std;
ofstream out("recycle2.out"); // Output file name assumed

class Trash {
  float _weight;
public:
  Trash(float wt) : _weight(wt) {}
  virtual float value() const = 0;
  float weight() const { return _weight; }
  virtual ~Trash() { out << "~Trash()\n"; }
};

class Aluminum : public Trash {
  static float val;
public:
  Aluminum(float wt) : Trash(wt) {}
  float value() const { return val; }
  static void value(int newval) {
    val = newval;
  }
};

float Aluminum::val = 1.67;

class Paper : public Trash {
  static float val;
public:
  Paper(float wt) : Trash(wt) {}
  float value() const { return val; }
  static void value(int newval) {
    val = newval;
  }
};

float Paper::val = 0.10;

class Glass : public Trash {
  static float val;
public:
  Glass(float wt) : Trash(wt) {}
  float value() const { return val; }
  static void value(int newval) {
    val = newval;
  }
};

float Glass::val = 0.23;

// Sums up the value of the Trash in a bin:
template<class Container> void
sumValue(Container& bin, ostream& os) {
  typename Container::iterator tally =
    bin.begin();
  float val = 0;
  while(tally != bin.end()) {
    val += (*tally)->weight() * (*tally)->value();
    os << "weight of "
       << typeid(*tally).name()
       << " = " << (*tally)->weight() << endl;
    tally++;
  }
  os << "Total value = " << val << endl;
}

int main() {
  srand(time(0)); // Seed random number generator
  vector<Trash*> bin;
  // Fill up the Trash bin:
  for(int i = 0; i < 30; i++)
    switch(rand() % 3) {
      case 0 :
        bin.push_back(new Aluminum(rand() % 100));
        break;
      case 1 :
        bin.push_back(new Paper(rand() % 100));
        break;
      case 2 :
        bin.push_back(new Glass(rand() % 100));
        break;
    }
  // Note difference w/ chapter 14: Bins hold
  // exact type of object, not base type:
  vector<Glass*> glassBin;
  vector<Paper*> paperBin;
  vector<Aluminum*> alBin;
  vector<Trash*>::iterator sorter = bin.begin();
  // Sort the Trash:
  while(sorter != bin.end()) {
    Aluminum* ap =
      dynamic_cast<Aluminum*>(*sorter);
    Paper* pp =
      dynamic_cast<Paper*>(*sorter);
    Glass* gp =
      dynamic_cast<Glass*>(*sorter);
    if(ap) alBin.push_back(ap);
    if(pp) paperBin.push_back(pp);
    if(gp) glassBin.push_back(gp);
    sorter++;
  }
  sumValue(alBin, out);
  sumValue(paperBin, out);
  sumValue(glassBin, out);
  sumValue(bin, out);
  purge(bin);
} ///:~

The nature of this problem is that the trash is thrown unclassified into a single bin, so the 
specific type information is lost. But later, the specific type information must be recovered to 
properly sort the trash, and so RTTI is used. In Chapter XX, an RTTI system was inserted into 
the class hierarchy, but as you can see here, it's more convenient to use C++'s built-in RTTI. 

Mechanism & overhead of 
RTTI 



Typically, RTTI is implemented by placing an additional pointer in the VTABLE. This pointer
points to the typeinfo structure for that particular type. (Only one instance of the
typeinfo structure is created for each new class.) So the effect of a typeid( ) expression is
quite simple: The VPTR is used to fetch the typeinfo pointer, and a reference to the resulting
typeinfo structure is produced. Also, this is a deterministic process: you always know how
long it's going to take.

For a dynamic_cast<destination*>(source_pointer), most cases are quite straightforward:
source_pointer's RTTI information is retrieved, and RTTI information for the type
destination* is fetched. Then a library routine determines whether the object that
source_pointer points to is of type destination or has destination as a base class. The
pointer it returns may be slightly adjusted because of multiple inheritance if the base type
isn't the first base of the derived class. The situation is (of course) more complicated with
multiple inheritance where a base type may appear more than once in an inheritance
hierarchy and where virtual base classes are used.

Because the library routine used for dynamic_cast must check through a list of base classes,
the overhead for dynamic_cast is higher than typeid( ) (but of course you get different
information, which may be essential to your solution), and it's nondeterministic because it
may take more time to discover a base class than a derived class. In addition, dynamic_cast
allows you to compare any type to any other type; you aren't restricted to comparing types
within the same hierarchy. This adds extra overhead to the library routine used by
dynamic_cast.

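Creating your own RTTI

If your compiler doesn't yet support RTTI, you can build it into your class libraries quite
easily. This makes sense because RTTI was added to the language after observing that
virtually all class libraries had some form of it anyway (and it was relatively "free" after
exception handling was added because exceptions require exact knowledge of type
information).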

Essentially, RTTI requires only a virtual function to identify the exact type of the class, and a
function to take a pointer to the base type and cast it down to the more derived type; this
function must produce a pointer to the more derived type. (You may also wish to handle
references.) There are a number of approaches to implement your own RTTI, but all require a
unique identifier for each class and a virtual function to produce type information. The
following uses a static member function called dynacast( ) that calls a type information
function dynamic_type( ). Both functions must be defined for each new derivation:

//: C08:Selfrtti.cpp
// Your own RTTI system
#include "../purge.h"
#include <iostream>
#include <vector>
using namespace std;

class Security {
protected:
  static const int baseID = 1000;
public:
  virtual ~Security() {} // So purge() deletes derived objects safely
  virtual int dynamic_type(int id) {
    if(id == baseID) return 1;
    return 0;
  }
};

class Stock : public Security {
protected:
  static const int typeID = baseID + 1;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Security::dynamic_type(id);
  }
  static Stock* dynacast(Security* s) {
    if(s->dynamic_type(typeID))
      return (Stock*)s;
    return 0;
  }
};

class Bond : public Security {
protected:
  static const int typeID = baseID + 2;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Security::dynamic_type(id);
  }
  static Bond* dynacast(Security* s) {
    if(s->dynamic_type(typeID))
      return (Bond*)s;
    return 0;
  }
};

class Commodity : public Security {
protected:
  static const int typeID = baseID + 3;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Security::dynamic_type(id);
  }
  static Commodity* dynacast(Security* s) {
    if(s->dynamic_type(typeID))
      return (Commodity*)s;
    return 0;
  }
  void special() {
    cout << "special Commodity function\n";
  }
};

class Metal : public Commodity {
protected:
  static const int typeID = baseID + 4;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Commodity::dynamic_type(id);
  }
  static Metal* dynacast(Security* s) {
    if(s->dynamic_type(typeID))
      return (Metal*)s;
    return 0;
  }
};

int main() {
  vector<Security*> portfolio;
  portfolio.push_back(new Metal);
  portfolio.push_back(new Commodity);
  portfolio.push_back(new Bond);
  portfolio.push_back(new Stock);
  vector<Security*>::iterator it =
    portfolio.begin();
  while(it != portfolio.end()) {
    Commodity* cm = Commodity::dynacast(*it);
    if(cm) cm->special();
    else cout << "not a Commodity" << endl;
    it++;
  }
  cout << "cast from intermediate pointer:\n";
  Security* sp = new Metal;
  Commodity* cp = Commodity::dynacast(sp);
  if(cp) cout << "it's a Commodity\n";
  Metal* mp = Metal::dynacast(sp);
  if(mp) cout << "it's a Metal too!\n";
  purge(portfolio);
} ///:~

Each subclass must create its own typeID, redefine the virtual dynamic_type( ) function to
return that typeID, and define a static member called dynacast( ), which takes the base
pointer (or a pointer at any level in a deeper hierarchy; in that case, the pointer is simply
upcast).

In the classes derived from Security, you can see that each defines its own typeID
constant by adding to baseID. It's essential that baseID be directly accessible in the
derived class because the constant must be evaluated at compile-time, so the usual approach of
reading private data with an inline function would fail. This is a good example of the need for
the protected mechanism.

The constant baseID establishes a base identifier for all types derived from Security. That way,
if an identifier clash ever occurs, you can change all the identifiers by changing the base
value. (However, because this scheme doesn't compare different inheritance trees, an
identifier clash is unlikely.) In all the classes, the class identifier number is protected, so it's
directly available to derived classes but not to the end user.

This example illustrates what built-in RTTI must cope with. Not only must you be able to
determine the exact type, you must also be able to find out whether your exact type is derived
from the type you're looking for. For example, Metal is derived from Commodity, which has
a function called special( ), so if you have a Metal object you can call special( ) for it. If
dynamic_type( ) told you only the exact type of the object, you could ask it if a Metal were a
Commodity, and it would say "no," which is untrue. Therefore, the system must be set up so
it will properly cast to intermediate types in a hierarchy as well as exact types.

The dynacast( ) function determines the type information by calling the virtual
dynamic_type( ) function for the Security pointer it's passed. This function takes an
argument of the typeID for the class you're trying to cast to. It's a virtual function, so the
function body is the one for the exact type of the object. Each dynamic_type( ) function first
checks to see if the identifier it was passed is an exact match for its own type. If that isn't true,
it must check to see if it matches a base type; this is accomplished by making a call to the
base class dynamic_type( ). Just like a recursive function call, each dynamic_type( ) checks
against its own identifier. If it doesn't find a match, it returns the result of calling the base
class dynamic_type( ). When the root of the hierarchy is reached, zero is returned to indicate
no match was found.

If dynamic_type( ) returns one (for "true") the object pointed to is either the exact type
you're asking about or derived from that type, and dynacast( ) takes the Security pointer and
casts it to the desired type. If the return value is false, dynacast( ) returns zero to indicate the
cast was unsuccessful. In this way it works just like the C++ dynamic_cast operator.

The C++ dynamic_cast operator does one more thing the above scheme can't do: It compares
types from one inheritance hierarchy to another, completely separate inheritance hierarchy.
This adds generality to the system for those unusual cases where you want to compare across
hierarchies, but it also adds some complexity and overhead.

You can easily imagine how to create a DYNAMIC_CAST macro that uses the above scheme 
and allows an easier transition to the built-in dynamic_cast operator. 
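One possible shape for such a macro is sketched below. This is only an assumption about how you might write it, with a trimmed-down copy of the Security hierarchy so the fragment stands on its own:

```cpp
// Hypothetical macro: maps onto the hand-built dynacast( ) scheme.
// Once the compiler supports RTTI it could be redefined as
//   #define DYNAMIC_CAST(Type, ptr) dynamic_cast<Type*>(ptr)
#define DYNAMIC_CAST(Type, ptr) Type::dynacast(ptr)

class Security {
protected:
  static const int baseID = 1000;
public:
  virtual ~Security() {}
  virtual int dynamic_type(int id) { return id == baseID; }
};

class Bond : public Security {
protected:
  static const int typeID = baseID + 2;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Security::dynamic_type(id);
  }
  static Bond* dynacast(Security* s) {
    return s->dynamic_type(typeID) ? (Bond*)s : 0;
  }
};

class Commodity : public Security {
protected:
  static const int typeID = baseID + 3;
public:
  int dynamic_type(int id) {
    if(id == typeID) return 1;
    return Security::dynamic_type(id);
  }
  static Commodity* dynacast(Security* s) {
    return s->dynamic_type(typeID) ? (Commodity*)s : 0;
  }
};
```

Client code writes DYNAMIC_CAST(Commodity, sp) everywhere, so switching to the built-in operator later means changing one line instead of every call site.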



Explicit cast syntax 



Whenever you use a cast, you're breaking the type system. [1] You're telling the compiler that
even though you know an object is a certain type, you're going to pretend it is a different
type. This is an inherently dangerous activity, and a clear source of errors.

Unfortunately, each cast is different: the name of the pretender type surrounded by
parentheses. So if you are given a piece of code that isn't working correctly and you know
you want to examine all casts to see if they're the source of the errors, how can you guarantee
that you find all the casts? In a C program, you can't. For one thing, the C compiler doesn't
always require a cast (it's possible to assign dissimilar types through a void pointer without
being forced to use a cast), and the casts all look different, so you can't know if you've
searched for every one.

[1] See Josée Lajoie, "The new cast notation and the bool datatype," C++ Report, September,
1994 pp. 46-51.
To solve this problem, C++ provides a consistent casting syntax using four reserved words:
dynamic_cast (the subject of the first part of this chapter), const_cast, static_cast, and
reinterpret_cast. This window of opportunity opened up when the need for dynamic_cast
arose: the meaning of the existing cast syntax was already far too overloaded to support any
additional functionality.



By using these casts instead of the (newtype) syntax, you can easily search for all the casts in
any program. To support existing code, most compilers have various levels of error/warning
generation that can be turned on and off. But if you turn on full errors for the explicit cast
syntax, you can be guaranteed that you'll find all the places in your project where casts occur,
which will make bug-hunting much easier.

The following table describes the different forms of casting:



static_cast        For "well-behaved" and "reasonably well-behaved" casts, 
                   including things you might now do without a cast (e.g., an 
                   upcast or automatic type conversion). 

const_cast         To cast away const and/or volatile. 

dynamic_cast       For type-safe downcasting (described earlier in the chapter). 

reinterpret_cast   To cast to a completely different meaning. The key is that 
                   you'll need to cast back to the original type to use it 
                   safely. The type you cast to is typically used only for bit 
                   twiddling or some other mysterious purpose. This is the most 
                   dangerous of all the casts. 



will be described more completely in the following 
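To make the table concrete, here is a minimal sketch that uses each of the four casts once. The names Base, Derived, Other, toInt( ), bump( ), tagOf( ) and roundTrips( ) are invented for this illustration and do not come from the book's listings.

```cpp
#include <cassert>

struct Base { virtual ~Base() {} };
struct Derived : Base { int tag() const { return 7; } };
struct Other : Base {};

// static_cast: a well-behaved conversion that discards the fraction.
inline int toInt(double d) { return static_cast<int>(d); }

// const_cast: casts away const. This is only safe when the object
// referred to was not itself defined const.
inline void bump(const int& r) { ++const_cast<int&>(r); }

// dynamic_cast: type-safe downcast; produces a null pointer when the
// object is not really a Derived.
inline int tagOf(Base* b) {
  Derived* d = dynamic_cast<Derived*>(b);
  return d ? d->tag() : -1;
}

// reinterpret_cast: the only safe use of the result is to cast it
// back to the original type.
inline bool roundTrips(int* p) {
  unsigned char* raw = reinterpret_cast<unsigned char*>(p);
  return reinterpret_cast<int*>(raw) == p;
}
```

Because every one of these conversions is spelled with a keyword ending in _cast, a plain text search for "_cast" finds them all, which is the point of the new syntax.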



Summary 

1. You don't have to build it into your own libraries. 

2. You don't have to worry whether it will be built into someone else's library. 

3. You don't have the extra programming overhead of maintaining an RTTI 
scheme during inheritance. 

4. The syntax is consistent, so you don't have to figure out a new one for each 
library. 
While RTTI is a convenience, like most features in C++ it can be misused by either a naive or 
determined programmer. The most common misuse may come from the programmer who 
doesn't understand virtual functions and uses RTTI to do type-check coding instead. The 
philosophy of C++ seems to be to provide you with powerful tools and guard for type 
violations and integrity, but if you want to deliberately misuse or get around a language 
feature, there's nothing to stop you. Sometimes a slight burn is the fastest way to gain 
experience. 
The explicit cast syntax will be a big help during debugging because casting opens a hole into 
your type system and allows errors to slip in. The explicit cast syntax will allow you to more 
easily locate these error entryways. 
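The misuse described above can be sketched as follows. The Shape, Circle, Square and Triangle classes and both helper functions are hypothetical, written only to contrast type-check coding with a virtual function.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

struct Shape {
  virtual ~Shape() {}
  virtual std::string name() const = 0;
};
struct Circle : Shape {
  std::string name() const { return "Circle"; }
};
struct Square : Shape {
  std::string name() const { return "Square"; }
};
struct Triangle : Shape {
  std::string name() const { return "Triangle"; }
};

// Type-check coding: every new Shape requires another test here,
// and a forgotten one fails silently.
inline std::string nameByTypeid(const Shape& s) {
  if (typeid(s) == typeid(Circle)) return "Circle";
  if (typeid(s) == typeid(Square)) return "Square";
  return "unknown";
}

// The virtual function needs no change when a new Shape is added.
inline std::string nameByVirtual(const Shape& s) { return s.name(); }
```

Triangle was added without updating nameByTypeid( ), so only the virtual version reports it correctly.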



Exercises 



1. Modify C16:AutoCounter.h in Volume 1 of this book so that it becomes a 
   useful debugging tool. It will be used as a nested member of each class that 
   you are interested in tracing. Turn AutoCounter into a template that takes 
   the class name of the surrounding class as the template argument, and in all 
   the error messages use RTTI to print out the name of the class. 

2. Use RTTI to assist in program debugging by printing out the exact name of 
   a template using typeid( ). Instantiate the template for various types and see 
   what the results are. 

3. Implement the function TurnColorIfYouAreA( ) described earlier in this 
   chapter using RTTI. 

4. Modify the Instrument hierarchy from Chapter XX by first copying 
   Wind5.cpp to a new location. Now add a virtual ClearSpitValve( ) 
   function to the Wind class, and redefine it for all the classes inherited from 
   Wind. Instantiate a TStash to hold Instrument pointers and fill it up with 
   various types of Instrument objects created using new. Now use RTTI to 
   move through the container looking for objects in class Wind, or derived 
   from Wind. Call the ClearSpitValve( ) function for these objects. Notice 
   that it would unpleasantly confuse the Instrument base class if it contained 
   a ClearSpitValve( ) function. 

9: Building stable 
systems 

Shared objects & reference 
counting 

Reference-counted class hierarchies 

Finding memory leaks 

1. For array bounds checking, use the Array template in C16:Array3.cpp of Volume 1 
for all arrays. You can turn off the checking and increase efficiency when you're 
ready to ship. (This doesn't deal with the case of taking a pointer to an array, though 
- perhaps that could be templatized somehow as well). 

2. Use the C10:MemCheck (wrong chapter number) to guarantee that dynamic memory 
is released properly. 



3. Check for non-virtual destructors in base classes. 



The canonical object & singly- 
rooted hierarchies 

An extended canonical form 

Design by contract 
Integrated unit testing 
Dynamic aggregation 

[[ This may actually be the "builder" design pattern in some form ]] 

The examples we've seen so far are illustrative, but fairly simple. It's useful to see an 
example that has more complexity so you can see that the STL will work in all situations. 

[[ Add a factory method that takes a vector of string]] 

The class that will be created as the example will be reasonably complex: it's a bicycle which 
can have a choice of parts. In addition, you can change the parts during the lifetime of a 
Bicycle object; this includes the ability to add new parts or to upgrade from standard-quality 
parts to "fancy" parts. The BicyclePart class is a base class with many different types, and the 
Bicycle class contains a vector<BicyclePart*> to hold the various combinations of parts that 
may be attached to a Bicycle: 

//: C09:Bicycle.h 
// Complex class involving dynamic aggregation 
#ifndef BICYCLE_H 
#define BICYCLE_H 
#include <vector> 
#include <string> 
#include <iostream> 
#include <typeinfo> 

// Counts live objects and reports the count on destruction 
class LeakChecker { 
  int count; 
public: 
  LeakChecker() : count(0) {} 
  void print() { 
    std::cout << count << std::endl; 
  } 
  ~LeakChecker() { print(); } 
  void operator++(int) { count++; } 
  void operator--(int) { count--; } 
}; 

class BicyclePart { 
  static LeakChecker lc; 
public: 
  BicyclePart() { lc++; } 
  virtual BicyclePart* clone() = 0; 
  virtual ~BicyclePart() { lc--; } 
  friend std::ostream& 
  operator<<(std::ostream& os, BicyclePart* bp) { 
    return os << typeid(*bp).name(); 
  } 
  friend class Bicycle; 
}; 

enum BPart { 
  Frame, Wheel, Seat, HandleBar, 
  Sprocket, Deraileur 
}; 

template<BPart id> 
class Part : public BicyclePart { 
public: 
  BicyclePart* clone() { return new Part<id>; } 
}; 

class Bicycle { 
public: 
  typedef std::vector<BicyclePart*> VBP; 
  Bicycle(); 
  Bicycle(const Bicycle& old); 
  Bicycle& operator=(const Bicycle& old); 
  // [Other operators as needed go here:] 
  // [...] 
  // [...] 
  ~Bicycle() { purge(); } 
  // So you can change parts on a bike (but be 
  // careful: you must clean up any objects you 
  // remove from the bicycle!) 
  VBP& bikeParts() { return parts; } 
  friend std::ostream& 
  operator<<(std::ostream& os, Bicycle* b); 
  static void print(std::vector<Bicycle*>& vb, 
    std::ostream& os = std::cout); 
private: 
  static int counter; 
  int id; 
  VBP parts; 
  void purge(); 
}; 

// Generator of default Bicycle objects; this 
// gives you the idea. 
struct BicycleGenerator { 
  Bicycle* operator()() { 
    return new Bicycle(); 
  } 
}; 
#endif // BICYCLE_H ///:~ 

The operator<< for ostream and Bicycle moves through and calls the operator<< for each 
BicyclePart, and that prints out the class name of the part so you can see what a Bicycle 
contains. The BicyclePart::clone( ) member function is necessary in the copy-constructor of 
Bicycle, since it just has a vector<BicyclePart*> and wouldn't otherwise know how to copy 
the BicycleParts correctly. The cloning process, of course, will be more involved when there 
are data members in a BicyclePart. 

The static LeakChecker member BicyclePart::lc is used to keep track of the number of parts 
created and destroyed (so you can detect memory leaks). It is incremented every time a new 
BicyclePart is created and decremented when one is destroyed; the LeakChecker prints its 
count when it is itself destroyed at program exit, so any value other than zero indicates a leak. 

If you want to change BicycleParts on a Bicycle, you just call Bicycle::bikeParts( ) to get 
the vector<BicyclePart*> which you can then modify. But whenever you remove a part from 
a Bicycle, you must call delete for that pointer, otherwise it won't get cleaned up. 
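The interplay of clone( ), deep copying and leak counting can be tried in a much smaller setting. The following is a sketch only: Part, Wheel and Bike are simplified, hypothetical stand-ins for the Bicycle classes, and the assignment operator uses copy-and-swap, which is a different technique from the purge-then-copy approach in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for BicyclePart: counts live objects.
class Part {
  static int live_;
public:
  Part() { ++live_; }
  Part(const Part&) { ++live_; }
  virtual ~Part() { --live_; }
  virtual Part* clone() const = 0;
  static int live() { return live_; }
};
int Part::live_ = 0;

class Wheel : public Part {
public:
  Part* clone() const { return new Wheel(*this); }
};

class Bike {
  std::vector<Part*> parts_;
public:
  Bike() {
    parts_.push_back(new Wheel);
    parts_.push_back(new Wheel);
  }
  // Deep copy: the vector holds only pointers, so each part
  // must be asked to clone itself.
  Bike(const Bike& old) {
    for (std::size_t i = 0; i < old.parts_.size(); ++i)
      parts_.push_back(old.parts_[i]->clone());
  }
  // Copy-and-swap: the temporary's destructor deletes the old parts.
  Bike& operator=(const Bike& old) {
    if (this != &old) {
      Bike tmp(old);
      parts_.swap(tmp.parts_);
    }
    return *this;
  }
  ~Bike() {
    for (std::size_t i = 0; i < parts_.size(); ++i)
      delete parts_[i];
  }
  std::size_t size() const { return parts_.size(); }
};
```

If the live count returns to zero once every Bike has gone out of scope, no part was leaked and none was deleted twice.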



//: C09:Bicycle.cpp {O} 
// Bicycle implementation 
#include "Bicycle.h" 
#include <map> 
#include <algorithm> 
#include <cassert> 
#include <iterator> 
using namespace std; 

// static member definitions: 
LeakChecker BicyclePart::lc; 
int Bicycle::counter = 0; 

Bicycle::Bicycle() : id(counter++) { 
  BicyclePart* bp[] = { 
    new Part<Frame>, 
    new Part<Wheel>, new Part<Wheel>, 
    new Part<Seat>, new Part<HandleBar>, 
    new Part<Sprocket>, new Part<Deraileur> 
  }; 
  const int bplen = sizeof bp / sizeof *bp; 
  parts = VBP(bp, bp + bplen); 
} 

Bicycle::Bicycle(const Bicycle& old) 
  : id(counter++), 
    parts(old.parts.begin(), old.parts.end()) { 
  // Replace each copied pointer with a clone: 
  for(VBP::size_type i = 0; i < parts.size(); i++) 
    parts[i] = parts[i]->clone(); 
} 

Bicycle& Bicycle::operator=(const Bicycle& old) { 
  if(this == &old) return *this; // Self-assignment 
  purge(); // Remove old lvalues 
  parts.resize(old.parts.size()); 
  copy(old.parts.begin(), 
    old.parts.end(), parts.begin()); 
  for(VBP::size_type i = 0; i < parts.size(); i++) 
    parts[i] = parts[i]->clone(); 
  return *this; 
} 

void Bicycle::purge() { 
  for(VBP::iterator it = parts.begin(); 
    it != parts.end(); it++) { 
    delete *it; 
    *it = 0; // Prevent multiple deletes 
  } 
} 

ostream& operator<<(ostream& os, Bicycle* b) { 
  copy(b->parts.begin(), b->parts.end(), 
    ostream_iterator<BicyclePart*>(os, "\n")); 
  return os; 
} 

void Bicycle::print(vector<Bicycle*>& vb, 
  ostream& os) { 
  copy(vb.begin(), vb.end(), 
    ostream_iterator<Bicycle*>(os, "\n")); 
  os << " " << endl; 
} ///:~ 



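Here's a test: 

//: C09:BikeTest.cpp 
//{L} Bicycle 
#include "Bicycle.h" 
#include <algorithm> 
#include <iterator> 
using namespace std; 

int main() { 
  vector<Bicycle*> bikes; 
  BicycleGenerator bg; 
  generate_n(back_inserter(bikes), 12, bg); 
  Bicycle::print(bikes); 
  // Clean up so the LeakChecker reports zero: 
  for(vector<Bicycle*>::size_type i = 0; i < bikes.size(); i++) 
    delete bikes[i]; 
} ///:~ 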



Exercises 



1. Create a heap compactor for all dynamic memory in a particular program. 
   This will require that you control how objects are dynamically created and 
   used (do you overload operator new or does that approach work?). The 
   typical heap-compaction scheme requires that all pointers are doubly- 
   indirected (that is, pointers to pointers) so the "middle tier" pointer can be 
   manipulated during compaction. Consider overloading operator-> to 
   accomplish this, since that operator has special behavior which will 
   probably benefit your heap-compaction scheme. Write a program to test 
   your heap-compaction scheme. 

10: Design patterns 

". . . describes a problem which occurs over and over again 
in our environment, and then describes the core of the 
solution to that problem, in such a way that you can use this 
solution a million times over, without ever doing it the same 
way twice" - Christopher Alexander 

This chapter introduces the important and yet non-traditional 
"patterns" approach to program design. 

[[ Much of the prose in this chapter still needs work, but the examples all compile. Also, more 
patterns and examples are forthcoming ]] 

Probably the most important step forward in object-oriented design is the "design patterns" 
movement, chronicled in Design Patterns, by Gamma, Helm, Johnson & Vlissides (Addison- 
Wesley 1995). That book shows 23 different solutions to particular classes of problems. In 
this chapter, the basic concepts of design patterns will be introduced along with examples. 
This should whet your appetite to read Design Patterns (a source of what has now become an 
essential, almost mandatory, vocabulary for OOP programmers). 

The latter part of this chapter contains an example of the design evolution process, starting 
with an initial solution and moving through the logic and process of evolving the solution to 
more appropriate designs. The program shown (a trash recycling simulation) has evolved over 
time, and you can look at that evolution as a prototype for the way your own design can start 
as an adequate solution to a particular problem and evolve into a flexible approach to a class 
of problems. 

The pattern concept 

Although they're called "design patterns," they really aren't tied to the realm of design. A 
pattern seems to stand apart from the traditional way of thinking about analysis, design, and 
implementation. Instead, a pattern embodies a complete idea within a program, and thus it can 
sometimes appear at the analysis phase or high-level design phase. This is interesting because 
a pattern has a direct implementation in code and so you might not expect it to show up before 
low-level design or implementation (and in fact you might not realize that you need a 
particular pattern until you get to those phases). 

The basic concept of a pattern can also be seen as the basic concept of program design: adding 
layers of abstraction. Whenever you abstract something you're isolating particular details, and 
one of the most compelling motivations behind this is to separate things that change from 
things that stay the same. Another way to put this is that once you find some part of your 
program that's likely to change for one reason or another, you'll want to keep those changes 
from propagating other modifications throughout your code. Not only does this make the code 
much cheaper to maintain, but it also turns out that it is usually simpler to understand (which 
results in lowered costs). 

Often, the most difficult part of developing an elegant and cheap-to-maintain design is in 
discovering what I call "the vector of change." (Here, "vector" refers to the maximum 
gradient and not a container class.) This means finding the most important thing that changes 
in your system, or put another way, discovering where your greatest cost is. Once you 
discover the vector of change, you have the focal point around which to structure your design. 

So the goal of design patterns is to isolate changes in your code. If you look at it this way, 
you've been seeing some design patterns already in this book. For example, inheritance could 
be thought of as a design pattern (albeit one implemented by the compiler). It allows you to 
express differences in behavior (that's the thing that changes) in objects that all have the same 
interface (that's what stays the same). Composition could also be considered a pattern, since it 
allows you to change - dynamically or statically - the objects that implement your class, and 
thus the way that class works. Normally, however, features that are directly supported by a 
programming language are not classified as design patterns. 

You've also already seen another pattern that appears in Design Patterns: the iterator. This is 
the fundamental tool used in the design of the STL; it hides the particular implementation of 
the container as you're stepping through and selecting the elements one by one. The iterator 
allows you to write generic code that performs an operation on all of the elements in a range 
without regard to the container that holds the range. Thus your generic code can be used with 
any container that can produce iterators. 
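As a small reminder of what that means in practice, here is a sketch of one generic function that works unchanged with an array, a vector and a list. The name sumRange( ) is made up for this example.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Generic code written only in terms of iterators: it neither knows
// nor cares which container produced the range.
template<class Iter>
int sumRange(Iter first, Iter last) {
  int total = 0;
  for (; first != last; ++first)
    total += *first;
  return total;
}
```

The only thing that varies between the calls is the type of iterator handed in; the traversal code stays the same.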

The singleton 

Possibly the simplest design pattern is the singleton, which is a way to provide one and only 
one instance of an object: 

//: C09:SingletonPattern.cpp 
#include <iostream> 
using namespace std; 

class Singleton { 
  static Singleton s; 
  int i; 
  Singleton(int x) : i(x) { } 
  void operator=(Singleton&);  // Disallowed 
  Singleton(const Singleton&); // Disallowed 
public: 
  static Singleton& getHandle() { 
    return s; 
  } 
  int getValue() { return i; } 
  void setValue(int x) { i = x; } 
}; 

Singleton Singleton::s(47); 

int main() { 
  Singleton& s = Singleton::getHandle(); 
  cout << s.getValue() << endl; 
  Singleton& s2 = Singleton::getHandle(); 
  s2.setValue(9); 
  cout << s.getValue() << endl; 
} ///:~ 



The key to creating a singleton is to prevent the client programmer from having any way to 
create an object except the ways you provide. To do this, you must declare all constructors as 
private, and you must create at least one constructor to prevent the compiler from 
synthesizing a default constructor for you. 



At this point, you decide how you're going to create your object. Here, it's created statically, 
but you can also wait until the client programmer asks for one and create it on demand. In any 
case, the object should be stored privately. You provide access through public methods. Here, 
getHandle( ) produces a reference to the Singleton object. The rest of the interface 
(getValue( ) and setValue( )) is the regular class interface. 

Note that you aren't restricted to creating only one object. This technique easily supports the 
creation of a limited pool of objects. In that situation, however, you can be confronted with 
the problem of sharing objects in the pool. If this is an issue, you can create a solution 
involving a check-out and check-in of the shared objects. 
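One way such a limited pool might look is sketched below. PooledObject, poolSize, checkOut( ) and checkIn( ) are all hypothetical names, and the fixed pool of two objects is an arbitrary choice for the illustration.

```cpp
#include <cassert>

class PooledObject {
public:
  enum { poolSize = 2 };
  // Hands out an unused object, or a null pointer if all are in use.
  static PooledObject* checkOut() {
    PooledObject* p = slots();
    for (int i = 0; i < poolSize; ++i)
      if (!p[i].checkedOut) {
        p[i].checkedOut = true;
        return &p[i];
      }
    return 0;
  }
  // Makes the object available to the next caller.
  static void checkIn(PooledObject* obj) {
    if (obj) obj->checkedOut = false;
  }
private:
  bool checkedOut;
  // As with the singleton, clients cannot create or copy objects:
  PooledObject() : checkedOut(false) {}
  PooledObject(const PooledObject&);
  void operator=(const PooledObject&);
  // The pool itself is stored privately.
  static PooledObject* slots() {
    static PooledObject pool[poolSize];
    return pool;
  }
};
```

The constructors are private exactly as in the singleton; the only difference is that the class now manages several privately stored objects instead of one.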

Variations on singleton 

Any static member object inside a class is an expression of singleton: one and only one will 
be made. So in a sense, the language has direct support for the idea; we certainly use it on a 
regular basis. However, there's a problem associated with static objects (member or not), and 
that's the order of initialization, as described in Volume 1 of this book. If one static object 
depends on another, it's important that the objects are initialized in the correct order. 

In Volume 1, you were shown how a static object defined inside a function can be used to 
control initialization order. This delays the initialization of the object until the first time the 
function is called. If the function returns a reference to the static object, it gives you the effect 
of a singleton while removing much of the worry of static initialization. For example, suppose 
you want to create a logfile upon the first call to a function which returns a reference to that 
logfile. This header file will do the trick: 



//: C09:LogFile.h 
#ifndef LOGFILE_H 
#define LOGFILE_H 
#include <fstream> 

// Declaration only; the definition (and its 
// static object) is in LogFile.cpp 
std::ofstream& logfile(); 
#endif // LOGFILE_H ///:~ 



The implementation must not be inlined, because that would mean that the whole function, 
including the static object definition within, could be duplicated in any translation unit where 
it's included, and you could end up with multiple copies of the static object. This would most 
certainly foil the attempts to control the order of initialization (but potentially in a subtle 
and hard-to-detect fashion). So the implementation must be separate: 

//: C09:LogFile.cpp {O} 
#include "LogFile.h" 

std::ofstream& logfile() { 
  static std::ofstream log("Logfile"); 
  return log; 
} ///:~ 

Now the log object will not be initialized until the first time logfile( ) is called. So if you use 
the function in one file: 

//: C09:UseLog1.h 
#ifndef USELOG1_H 
#define USELOG1_H 
void f(); 
#endif // USELOG1_H ///:~ 

//: C09:UseLog1.cpp {O} 
#include "UseLog1.h" 
#include "LogFile.h" 
void f() { 
  logfile() << __FILE__ << std::endl; 
} ///:~ 

And again in another file: 

//: C09:UseLog2.cpp 
//{L} UseLog1 LogFile 
#include "UseLog1.h" 
#include "LogFile.h" 

void g() { 
  logfile() << __FILE__ << std::endl; 
} 

int main() { 
  f(); 
  g(); 
} ///:~ 



Then the log object doesn't get created until the first call to f( ). 
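You can convince yourself of this with a single-file sketch that counts constructions. Here Log, logger( ) and the constructions counter are hypothetical, and a string stream stands in for the file so that the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

static int constructions = 0; // How many Log objects were ever built

struct Log {
  std::ostringstream out;
  Log() { ++constructions; }
};

// The static object is created the first time control passes
// through its definition, not at program startup.
Log& logger() {
  static Log theLog;
  return theLog;
}

void f() { logger().out << "f "; }
void g() { logger().out << "g "; }
```

Before the first call to f( ) the counter is still zero; afterward it stays at one no matter how many functions use logger( ).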



You can easily combine the creation of the static object inside a member function with the 
singleton class. SingletonPattern.cpp can be modified to use this approach: 

//: C09:SingletonPattern2.cpp 
#include <iostream> 
using namespace std; 

class Singleton { 
  int i; 
  Singleton(int x) : i(x) { } 
  void operator=(Singleton&); 
  Singleton(const Singleton&); 
public: 
  static Singleton& getHandle() { 
    static Singleton s(47); 
    return s; 
  } 
  int getValue() { return i; } 
  void setValue(int x) { i = x; } 
}; 

int main() { 
  Singleton& s = Singleton::getHandle(); 
  cout << s.getValue() << endl; 
  Singleton& s2 = Singleton::getHandle(); 
  s2.setValue(9); 
  cout << s.getValue() << endl; 
} ///:~ 
An especially interesting case is if two of these singletons depend on each other, like this: 

//: C09:FunctionStaticSingleton.cpp 

class Singleton1 { 
  Singleton1() {} 
public: 
  static Singleton1& ref() { 
    static Singleton1 single; 
    return single; 
  } 
}; 

class Singleton2 { 
  Singleton1& s1; 
  Singleton2(Singleton1& s) : s1(s) {} 
public: 
  static Singleton2& ref() { 
    static Singleton2 single(Singleton1::ref()); 
    return single; 
  } 
  Singleton1& f() { return s1; } 
}; 

int main() { 
  Singleton1& s1 = Singleton2::ref().f(); 
} ///:~ 

When Singleton2::ref( ) is called, it causes its sole Singleton2 object to be created. In the 
process of this creation, Singleton1::ref( ) is called, and that causes the sole Singleton1 
object to be created. Because this technique doesn't rely on the order of linking or loading, the 
programmer has much better control over initialization, leading to fewer problems. 
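The order of creation can be made visible by having each constructor append to a trace. This is only a sketch: Engine, Car and trace( ) are invented names that play the roles of the first singleton, the dependent singleton and a recording string.

```cpp
#include <cassert>
#include <string>

// Records construction order; itself a function-static object.
std::string& trace() {
  static std::string t;
  return t;
}

class Engine {
  Engine() { trace() += "Engine "; }
  Engine(const Engine&);
public:
  static Engine& ref() {
    static Engine single;
    return single;
  }
};

class Car {
  Engine& engine;
  Car(Engine& e) : engine(e) { trace() += "Car "; }
  Car(const Car&);
public:
  static Car& ref() {
    // Evaluating the argument forces Engine to be built first:
    static Car single(Engine::ref());
    return single;
  }
  Engine& f() { return engine; }
};
```

Asking only for the Car is enough; the Engine it depends on is created on the way, and always before the Car's constructor body runs.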

You'll see further examples of the singleton pattern in the rest of this chapter. 



Classifying patterns 



The Design Patterns book discusses 23 different patterns, classified under three purposes (all 
of which revolve around the particular aspect that can vary). The three purposes are: 

1. Creational: how an object can be created. This often involves isolating the details of 
object creation so your code isn't dependent on what types of objects there are and thus 
doesn't have to be changed when you add a new type of object. The aforementioned 
Singleton is classified as a creational pattern, and later in this chapter you'll see examples 
of Factory Method and Prototype. 

2. Structural: designing objects to satisfy particular project constraints. These work with 
the way objects are connected with other objects to ensure that changes in the system 
don't require changes to those connections. 

3. Behavioral: objects that handle particular types of actions within a program. These 
encapsulate processes that you want to perform, such as interpreting a language, fulfilling 
a request, moving through a sequence (as in an iterator), or implementing an algorithm. 
This chapter contains examples of the Observer and the Visitor patterns. 

The Design Patterns book has a section on each of its 23 patterns along with one or more 
examples for each, typically in C++ but sometimes in Smalltalk. This book will not repeat all 
the details of the patterns shown in Design Patterns since that book stands on its own and 
should be studied separately. The catalog and examples provided here are intended to rapidly 
give you a grasp of the patterns, so you can get a decent feel for what patterns are about and 
why they are so important. 



[[ Describe different form of categorization, based on what you want to accomplish rather 
than the way the patterns look. More categories, but should result in easier-to-understand, 
faster selection ]] 



Features, idioms, patterns 
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Hiding types (polymorph is 



Factories: encapsulating object 
creation 

The solution is to force the creation of objects to occur through a common factory rather than 
to allow the creational code to be spread throughout your system. If all the code in your 
program must go through this factory whenever it needs to create one of your objects, then all 
you must do when you add a new object is to modify the factory. 

Since every object-oriented program creates objects, and since it's very likely you will extend 
your program by adding new types, I suspect that factories may be the most universally useful 
kinds of design patterns. 



//: C09:ShapeFactory1.cpp 
#include "../purge.h" 
#include <iostream> 
#include <string> 
#include <stdexcept> 
#include <vector> 
using namespace std; 

class Shape { 
public: 
  virtual void draw() = 0; 
  virtual void erase() = 0; 
  virtual ~Shape() {} 
  class BadShapeCreation : public logic_error { 
  public: 
    BadShapeCreation(string type) 
      : logic_error("Cannot create type " + type) {} 
  }; 
  // Throws BadShapeCreation for an unknown type: 
  static Shape* factory(string type); 
}; 

class Circle : public Shape { 
  Circle() {} // Private constructor 
  friend class Shape; 
public: 
  void draw() { cout << "Circle::draw\n"; } 
  void erase() { cout << "Circle::erase\n"; } 
  ~Circle() { cout << "Circle::~Circle\n"; } 
}; 

class Square : public Shape { 
  Square() {} 
  friend class Shape; 
public: 
  void draw() { cout << "Square::draw\n"; } 
  void erase() { cout << "Square::erase\n"; } 
  ~Square() { cout << "Square::~Square\n"; } 
}; 

Shape* Shape::factory(string type) { 
  if(type == "Circle") return new Circle; 
  if(type == "Square") return new Square; 
  throw BadShapeCreation(type); 
} 

const char* shlist[] = { "Circle", "Square", "Square", 
  "Circle", "Circle", "Circle", "Square", "" }; 

int main() { 
  vector<Shape*> shapes; 
  try { 
    for(const char** cp = shlist; **cp; cp++) 
      shapes.push_back(Shape::factory(*cp)); 
  } catch(const Shape::BadShapeCreation& e) { 
    cout << e.what() << endl; 
    return 1; 
  } 
  for(vector<Shape*>::size_type i = 0; i < shapes.size(); i++) { 
    shapes[i]->draw(); 
  } 
  purge(shapes); 
} ///:~ 

The factory( ) takes an argument that allows it to determine what type of Shape to create; it 
happens to be a string in this case but it could be any set of data. The factory( ) is now the 
only other code in the system that needs to be changed when a new type of Shape is added 
(the initialization data for the objects will presumably come from somewhere outside the 
system, and not be a hard-coded array as in the above example). 



To ensure that the creation can only happen in the factory( ), the constructors for the specific 
types of Shape are made private, and Shape is declared a friend so that factory( ) has access 
to the constructors (you could also declare only Shape::factory( ) to be a friend, but it seems 
reasonably harmless to declare the entire base class as a friend). 



Polymorphic factories 



The static factory( ) method in the previous example forces all the creation operations to be 
focused in one spot, so that's the only place you need to change the code. This is certainly a 
reasonable solution, as it throws a box around the process of creating objects. However, the 
Design Patterns book emphasizes that the reason for the Factory Method pattern is so that 
different types of factories can be subclassed from the basic factory (the above design is 
mentioned as a special case). However, the book does not provide an example, but instead just 
repeats the example used for the Abstract Factory. Here is ShapeFactory1.cpp modified so 
the factory methods are in a separate class as virtual functions: 

//: C09:ShapeFactory2.cpp 
// Polymorphic factory methods 
#include "../purge.h" 
#include <iostream> 
#include <string> 
#include <stdexcept> 
#include <vector> 
#include <map> 
using namespace std; 

class Shape { 
public: 
  virtual void draw() = 0; 
  virtual void erase() = 0; 
  virtual ~Shape() {} 
}; 

class ShapeFactory { 
  virtual Shape* create() = 0; 
  static map<string, ShapeFactory*> factories; 
public: 
  virtual ~ShapeFactory() {} 
  friend class ShapeFactoryInizializer; 
  class BadShapeCreation : public logic_error { 
  public: 
    BadShapeCreation(string type) 
      : logic_error("Cannot create type " + type) {} 
  }; 
  // Throws BadShapeCreation for an unknown id: 
  static Shape* createShape(string id) { 
    if(factories.find(id) != factories.end()) 
      return factories[id]->create(); 
    throw BadShapeCreation(id); 
  } 
}; 

// Define the static object: 
map<string, ShapeFactory*> 
  ShapeFactory::factories; 

class Circle : public Shape { 
  Circle() {} // Private constructor 
public: 
  void draw() { cout << "Circle::draw\n"; } 
  void erase() { cout << "Circle::erase\n"; } 
  ~Circle() { cout << "Circle::~Circle\n"; } 
  class Factory; 
  friend class Factory; 
  class Factory : public ShapeFactory { 
  public: 
    Shape* create() { return new Circle; } 
  }; 
}; 

class Square : public Shape { 
  Square() {} 
public: 
  void draw() { cout << "Square::draw\n"; } 
  void erase() { cout << "Square::erase\n"; } 
  ~Square() { cout << "Square::~Square\n"; } 
  class Factory; 
  friend class Factory; 
  class Factory : public ShapeFactory { 
  public: 
    Shape* create() { return new Square; } 
  }; 
}; 

// Singleton to initialize the ShapeFactory: 
class ShapeFactoryInizializer { 
  static ShapeFactoryInizializer si; 
  ShapeFactoryInizializer() { 
    ShapeFactory::factories["Circle"] = 
      new Circle::Factory; 
    ShapeFactory::factories["Square"] = 
      new Square::Factory; 
  } 
}; 

// Static member definition: 
ShapeFactoryInizializer 
  ShapeFactoryInizializer::si; 

const char* shlist[] = { "Circle", "Square", "Square", 
  "Circle", "Circle", "Circle", "Square", "" }; 

int main() { 
  vector<Shape*> shapes; 
  try { 
    for(const char** cp = shlist; **cp; cp++) 
      shapes.push_back( 
        ShapeFactory::createShape(*cp)); 
  } catch(const ShapeFactory::BadShapeCreation& e) { 
    cout << e.what() << endl; 
    return 1; 
  } 
  for(vector<Shape*>::size_type i = 0; i < shapes.size(); i++) { 
    shapes[i]->draw(); 
  } 
  purge(shapes); 
} ///:~ 

Now the factory method appears in its own class, ShapeFactory, as the virtual create( ). 
This is a private method which means it cannot be called directly, but it can be overridden. 
The subclasses of Shape must each create their own subclasses of ShapeFactory and 
override the create( ) method to create an object of their own type. The actual creation of 
shapes is performed by calling ShapeFactory::createShape( ), which is a static method that 
uses the map in ShapeFactory to find the appropriate factory object based on an identifier 
that you pass it. The factory is immediately used to create the shape object, but you could 
imagine a more complex problem where the appropriate factory object is returned and then 
used by the caller to create an object in a more sophisticated way. However, it seems that 
much of the time you don't need the intricacies of the polymorphic factory method, and a 
single static method in the base class (as shown in ShapeFactory1.cpp) will work fine. 

Notice that the ShapeFactory must be initialized by loading its map with factory objects, 
which takes place in the singleton ShapeFactoryInizializer. So to add a new type to this 
design you must inherit the type, create a factory, and modify ShapeFactoryInizializer so 
that an instance of your factory is inserted in the map. This extra complexity again suggests 
the use of a static factory method if you don't need to create individual factory objects. 
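The essential mechanism, a map from identifiers to factory objects, can be isolated in a few lines. The following is a sketch under assumptions of my own: Registry and FactoryFor are hypothetical names, the factory for each type is generated by a template rather than written by hand, and an unknown identifier yields a null pointer instead of an exception.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Shape {
  virtual ~Shape() {}
  virtual std::string name() const = 0;
};
struct Circle : Shape {
  std::string name() const { return "Circle"; }
};
struct Square : Shape {
  std::string name() const { return "Square"; }
};

class ShapeFactory {
public:
  virtual ~ShapeFactory() {}
  virtual Shape* create() const = 0;
};

// One concrete factory per Shape type, generated by the compiler.
template<class T>
class FactoryFor : public ShapeFactory {
public:
  Shape* create() const { return new T; }
};

class Registry {
  std::map<std::string, ShapeFactory*> factories;
  Registry(const Registry&);
  void operator=(const Registry&);
public:
  Registry() {}
  ~Registry() {
    std::map<std::string, ShapeFactory*>::iterator it;
    for (it = factories.begin(); it != factories.end(); ++it)
      delete it->second;
  }
  // Takes ownership of f, replacing any factory already stored for id.
  void add(const std::string& id, ShapeFactory* f) {
    delete factories[id];
    factories[id] = f;
  }
  // Returns a new Shape, or a null pointer for an unknown id.
  Shape* createShape(const std::string& id) const {
    std::map<std::string, ShapeFactory*>::const_iterator it =
      factories.find(id);
    return it == factories.end() ? 0 : it->second->create();
  }
};
```

Adding a new type here still requires one more add( ) call, which is the same "modify the initializer" step the design above demands.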



Abstract factories 



The Abstract Factory pattern looks like the factory objects we've seen previously, with not 
one but several factory methods. Each of the factory methods creates a different kind of 
object. The idea is that at the point of creation of the factory object, you decide how all the 
objects created by that factory will be used. The example given in Design Patterns 
implements portability across various graphical user interfaces (GUIs): you create a factory 
object appropriate to the GUI that you're working with, and from then on when you ask it for 
a menu, button, slider, etc. it will automatically create the appropriate version of that item for 
the GUI. Thus you're able to isolate, in one place, the effect of changing from one GUI to 
another. 

As another example, suppose you are creating a general-purpose gaming environment and you 
want to be able to support different types of games. Here's how it might look using an 
abstract factory: 

//: C09:AbstractFactory.cpp 
#include <iostream> 
using namespace std; 

class Obstacle { 
public: 
  virtual void action() = 0; 
  virtual ~Obstacle() {} 
}; 

class Player { 
public: 
  virtual void interactWith(Obstacle*) = 0; 
  virtual ~Player() {} 
}; 

class Kitty: public Player { 
  virtual void interactWith(Obstacle* ob) { 
    cout << "Kitty has encountered a "; 
    ob->action(); 
  } 
}; 

class KungFuGuy: public Player { 
  virtual void interactWith(Obstacle* ob) { 
    cout << "KungFuGuy now battles against a "; 
    ob->action(); 
  } 
}; 

class Puzzle: public Obstacle { 
public: 
  void action() { cout << "Puzzle\n"; } 
}; 

class NastyWeapon: public Obstacle { 
public: 
  void action() { cout << "NastyWeapon\n"; } 
}; 

// The abstract factory: 
class GameElementFactory { 
public: 
  virtual Player* makePlayer() = 0; 
  virtual Obstacle* makeObstacle() = 0; 
  virtual ~GameElementFactory() {} 
}; 

class KittiesAndPuzzles : 
  public GameElementFactory { 
public: 
  virtual Player* makePlayer() { 
    return new Kitty; 
  } 
  virtual Obstacle* makeObstacle() { 
    return new Puzzle; 
  } 
}; 

class KillAndDismember : 
  public GameElementFactory { 
public: 
  virtual Player* makePlayer() { 
    return new KungFuGuy; 
  } 
  virtual Obstacle* makeObstacle() { 
    return new NastyWeapon; 
  } 
}; 

class GameEnvironment { 
  GameElementFactory* gef; 
  Player* p; 
  Obstacle* ob; 
public: 
  GameEnvironment(GameElementFactory* factory) : 
    gef(factory), p(factory->makePlayer()), 
    ob(factory->makeObstacle()) {} 
  void play() { 
    p->interactWith(ob); 
  } 
  ~GameEnvironment() { 
    delete p; 
    delete ob; 
    delete gef; 
  } 
}; 

int main() { 
  GameEnvironment 
    g1(new KittiesAndPuzzles), 
    g2(new KillAndDismember); 
  g1.play(); 
  g2.play(); 
} ///:~ 

In this environment, Player objects interact with Obstacle objects, but there are different
types of players and obstacles depending on what kind of game you're playing. You
determine the kind of game by choosing a particular GameElementFactory, and then the
GameEnvironment controls the setup and play of the game. In this example, the setup and
play is very simple, but those activities (the initial conditions and the state change) can
determine much of the game's outcome. Here, GameEnvironment is not designed to be
inherited, although it could very possibly make sense to do that.



This also contains examples of Double Dispatching and the Factory Method, both of which 
will be explained later. 



Virtual constructors 



During a constructor call, the virtual mechanism does not operate (early binding
occurs). Sometimes this is awkward. For example, in the Shape program it seems logical that
inside the constructor for a Shape object, you would want to set everything up and then 
draw( ) the Shape. draw( ) should be a virtual function, a message to the Shape that it should 
draw itself appropriately, depending on whether it is a circle, square, line, and so on. 
However, this doesn't work inside the constructor, for the reasons given in Chapter XX: 
Virtual functions resolve to the "local" function bodies when called in constructors. 
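The effect can be seen in a small sketch. The classes Base and Derived and the member names used here are assumptions made up for the illustration; they are not part of the Shape program.

```cpp
#include <string>

class Base {
public:
  std::string log;
  // While Base's constructor runs, the object is still only a
  // Base, so this call resolves to Base::name(), never to an
  // override in a derived class.
  Base() { log += name(); }
  virtual ~Base() {}
  virtual std::string name() const { return "Base"; }
  // An ordinary virtual call made after construction:
  std::string current() const { return name(); }
};

class Derived : public Base {
public:
  std::string name() const { return "Derived"; }
};
```

After `Derived d;` the member d.log holds "Base" (the call made inside the constructor), while d.current( ) returns "Derived" (the same call made once construction has finished).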

If you want to be able to call a virtual function inside the constructor and have it do the right 
thing, you must use a technique to simulate a virtual constructor (which is a variation of the 
Factory Method). This is a conundrum. Remember the idea of a virtual function is that you 
send a message to an object and let the object figure out the right thing to do. But a 
constructor builds an object. So a virtual constructor would be like saying, "I don't know 
exactly what type of object you are, but build yourself anyway." In an ordinary constructor, 
the compiler must know which VTABLE address to bind to the VPTR, and if it existed, a 
virtual constructor couldn't do this because it doesn't know all the type information at 
compile-time. It makes sense that a constructor can't be virtual because it is the one function 
that absolutely must know everything about the type of the object. 

And yet there are times when you want something approximating the behavior of a virtual
constructor.

In the Shape example, it would be nice to hand the Shape constructor some specific
information in the argument list and let the constructor create a specific type of Shape (a
Circle or a Square) with no further intervention. Ordinarily, you'd have to make an explicit call
to the Circle or Square constructor yourself.
Coplien* calls his solution to this problem "envelope and letter classes." The "envelope"
class is the base class, a shell that contains a pointer to an object of the base class. The
constructor for the "envelope" determines (at runtime, when the constructor is called, not at
compile-time, when the type checking is normally done) what specific type to make, then
creates an object of that specific type (on the heap) and assigns the object to its pointer. All
the function calls are then handled by the base class through its pointer. So the base class is
acting as a proxy for the derived class:



//: C09:VirtualConstructor.cpp
#include <iostream>
#include <string>
#include <exception>
#include <vector>
using namespace std;

class Shape {
  Shape* s;
  // Prevent copy-construction & operator=
  Shape(Shape&);
  Shape operator=(Shape&);
protected:
  Shape() { s = 0; };
public:
  virtual void draw() { s->draw(); }
  virtual void erase() { s->erase(); }
  virtual void test() { s->test(); }
  virtual ~Shape() {
    cout << "~Shape\n";
    if(s) {
      cout << "Making virtual call: ";
      s->erase(); // Virtual call
    }
    cout << "delete s: ";
    delete s; // The polymorphic deletion
  }
  class BadShapeCreation : public exception {
    string reason;
  public:
    BadShapeCreation(string type) {
      reason = "Cannot create type " + type;
    }
    ~BadShapeCreation() noexcept {}
    const char* what() const noexcept {
      return reason.c_str();
    }
  };
  // The "virtual constructor." The original listing gave
  // it a throw(BadShapeCreation) exception specification,
  // a feature that was removed from the language in C++17:
  Shape(string type);
};

class Circle : public Shape {
  Circle(Circle&);
  Circle operator=(Circle&);
  Circle() {} // Private constructor
  friend class Shape;
public:
  void draw() { cout << "Circle::draw\n"; }
  void erase() { cout << "Circle::erase\n"; }
  void test() { draw(); }
  ~Circle() { cout << "Circle::~Circle\n"; }
};

class Square : public Shape {
  Square(Square&);
  Square operator=(Square&);
  Square() {}
  friend class Shape;
public:
  void draw() { cout << "Square::draw\n"; }
  void erase() { cout << "Square::erase\n"; }
  void test() { draw(); }
  ~Square() { cout << "Square::~Square\n"; }
};

Shape::Shape(string type) {
  if(type == "Circle")
    s = new Circle;
  else if(type == "Square")
    s = new Square;
  else throw BadShapeCreation(type);
  draw();  // Virtual call in the constructor
}

const char* shlist[] = { "Circle", "Square", "Square",
  "Circle", "Circle", "Circle", "Square", "" };

int main() {
  vector<Shape*> shapes;
  try {
    for(const char** cp = shlist; **cp; cp++)
      shapes.push_back(new Shape(*cp));
  } catch(const Shape::BadShapeCreation& e) {
    cout << e.what() << endl;
    return 1;
  }
  for(size_t i = 0; i < shapes.size(); i++) {
    shapes[i]->draw();
    cout << "test\n";
    shapes[i]->test();
    cout << "end test\n";
    shapes[i]->erase();
  }
  Shape c("Circle"); // Create on the stack
  cout << "destructor calls:" << endl;
  for(size_t j = 0; j < shapes.size(); j++) {
    delete shapes[j];
  }
} ///:~

* James O. Coplien, Advanced C++ Programming Styles and Idioms, Addison-Wesley, 1992.


The base class Shape contains a pointer to an object of type Shape as its only data member.
When you build a "virtual constructor" scheme, you must exercise special care to ensure this
pointer is always initialized.



Each time you derive a new subtype from Shape, you must go back and add the creation for 
that type in one place, inside the "virtual constructor" in the Shape base class. This is not too 
onerous a task, but the disadvantage is you now have a dependency between the Shape class 
and all classes derived from it (a reasonable trade-off, it seems). Also, because it is a proxy, 
the base-class interface is truly the only thing the user sees. 

In this example, the information you must hand the virtual constructor about what type to
create is very explicit: It's a string that names the type. However, your scheme may use other
information; for example, in a parser the output of the scanner may be handed to the virtual
constructor, which then uses that information to determine which token to create.

The virtual constructor Shape(type) can only be declared inside the class; it cannot be defined
until after all the derived classes have been declared. However, the default constructor can be
defined inside class Shape, but it should be made protected so temporary Shape objects
cannot be created. This default constructor is only called by the constructors of derived-class
objects. You are forced to explicitly create a default constructor because the compiler will
create one for you automatically only if there are no constructors defined. Because you must
define Shape(type), you must also define Shape( ).
The default constructor in this scheme has at least one very important chore: it must set the
value of the s pointer to zero. This sounds strange at first, but remember that the default
constructor will be called as part of the construction of the actual object (in Coplien's terms,
the "letter," not the "envelope"). However, the "letter" is derived from the "envelope," so it
also inherits the data member s. In the "envelope," s is important because it points to the
actual object, but in the "letter," s is simply excess baggage. Even excess baggage should be
initialized, however, and if s is not set to zero by the default constructor called for the "letter,"
bad things happen (as you'll see later).

The virtual constructor takes as its argument information that completely determines the type 
of the object. Notice, though, that this type information isn't read and acted upon until 
runtime, whereas normally the compiler must know the exact type at compile-time (one other 
reason this system effectively imitates virtual constructors). 

Inside the virtual constructor there's an if-else chain that uses the argument to construct
the actual ("letter") object, which is then assigned to the pointer inside the "envelope." At that
point, the construction of the "letter" has been completed, so any virtual calls will be properly
directed.

As an example, consider the call to draw( ) inside the virtual constructor. If you trace this call
(either by hand or with a debugger), you can see that it starts in the draw( ) function in the
base class, Shape. This function calls draw( ) for the "envelope" s pointer to its "letter." All
types derived from Shape share the same interface, so this virtual call is properly executed,
even though it seems to be in the constructor. (Actually, the constructor for the "letter" has
already completed.) As long as all virtual calls in the base class simply make calls to the identical
virtual function through the pointer to the "letter," the system operates properly.

To understand how it works, consider the code in main( ). To fill the vector shapes, "virtual
constructor" calls are made to Shape. Ordinarily in a situation like this, you would call the
constructor for the actual type, and the VPTR for that type would be installed in the object.
Here, however, the VPTR used in each case is the one for Shape, not the one for the specific
Circle or Square.

In the for loop where the draw( ) and erase( ) functions are called for each Shape, the virtual
function call resolves, through the VPTR, to the corresponding type. However, this is Shape
in each case. In fact, you might wonder why draw( ) and erase( ) were made virtual at all.
The reason shows up in the next step: The base-class version of draw( ) makes a call, through
the "letter" pointer s, to the virtual function draw( ) for the "letter." This time the call
resolves to the actual type of the object, not just the base class Shape. Thus the runtime cost
of using virtual constructors is one more virtual call every time you make a virtual function
call.

In order to create any function that is overridden, such as draw( ), erase( ) or test( ), you must 
proxy all calls to the s pointer in the base class implementation, as shown above. This is 
because, when the call is made, the call to the envelope's member function will resolve as 
being to Shape, and not to a derived type of Shape. Only when you make the proxy call to s 
will the virtual behavior take place. In main( ), you can see that everything works correctly,
even when calls are made inside constructors and destructors. 
Destructor operation



The activities of destruction in this scheme are also tricky. To understand, let's verbally walk
through what happens when you call delete for a pointer to a Shape object (specifically, a
Square) created on the heap. (This is more complicated than an object created on the stack.)
This will be a delete through the polymorphic interface, as in the statement delete shapes[i].

The type of the pointer shapes[i] is of the base class Shape, so the compiler makes the call
through Shape. Normally, you might say that it's a virtual call, so Square's destructor will be
called. But with the virtual constructor scheme, the compiler is creating actual Shape objects,
even though the constructor initializes the letter pointer to a specific type of Shape. The
virtual mechanism is used, but the VPTR inside the Shape object is Shape's VPTR, not
Square's. This resolves to Shape's destructor, which calls delete for the letter pointer s,
which actually points to a Square object. This is again a virtual call, but this time it resolves
to Square's destructor.

With a destructor, however, C++ guarantees, via the compiler, that all destructors in the 
hierarchy are called. Square's destructor is called first, followed by any intermediate 
destructors, in order, until finally the base-class destructor is called. This base-class destructor 
has code that says delete s. When this destructor was called originally, it was for the
"envelope" s, but now it's for the "letter" s, which is there because the "letter" was inherited 
from the "envelope," and not because it contains anything. So this call to delete should do 
nothing. 

The solution to the problem is to make the "letter" s pointer zero. Then when the "letter" 
base-class destructor is called, you get delete 0, which by definition does nothing. Because 
the default constructor is protected, it will be called only during the construction of a "letter," 
so that's the only situation where s is set to zero. 

Your most common tool for hiding construction will probably be ordinary factory methods 
rather than the more complex approaches. The idea of adding new types with minimal effect 
on the rest of the system will be further explored later in this chapter. 
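For comparison, here is a minimal sketch of an ordinary factory method applied to the same Shape idea. It is an assumption-laden illustration rather than a replacement for the listing above: the static function create( ), the name( ) member, and the use of std::unique_ptr and std::invalid_argument are choices made only for this sketch.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

class Shape {
public:
  virtual ~Shape() {}
  virtual std::string name() const = 0;
  // The factory method: the one place that knows
  // about the concrete types.
  static std::unique_ptr<Shape> create(const std::string& type);
};

class Circle : public Shape {
  Circle() {}          // Private: only the factory can create
  friend class Shape;
public:
  std::string name() const { return "Circle"; }
};

class Square : public Shape {
  Square() {}
  friend class Shape;
public:
  std::string name() const { return "Square"; }
};

std::unique_ptr<Shape> Shape::create(const std::string& type) {
  if(type == "Circle")
    return std::unique_ptr<Shape>(new Circle);
  if(type == "Square")
    return std::unique_ptr<Shape>(new Square);
  throw std::invalid_argument("Cannot create type " + type);
}
```

Here the client receives a pointer to the real derived object, so there is no proxy call and no "letter" pointer to zero out; the cost is that the client must say Shape::create("Circle") rather than using constructor syntax.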



Callbacks 
Functor/Command

Strategy 

Observer 



Like the other forms of callback, this contains a hook point where you can change code. The
difference is in the observer's completely dynamic nature. It is often used for the specific case
of changes based on another object's change of state, but it is also the basis of event management.
Use it anytime you want to decouple the source of the call from the called code in a completely
dynamic way.

The observer pattern solves a fairly common problem: What if a group of objects needs to
update themselves when some other object changes state? This can be seen in the "model-view"
aspect of Smalltalk's MVC (model-view-controller), or the almost-equivalent
"Document-View Architecture." Suppose that you have some data (the "document") and
more than one view, say a plot and a textual view. When you change the data, the two views
must know to update themselves, and that's what the observer facilitates.

There are two types of objects used to implement the observer pattern in the following code.
The Observable class keeps track of everybody who wants to be informed when a change
happens, whether the "state" has changed or not. When someone says "OK, everybody should
check and potentially update themselves," the Observable class performs this task in its
notifyObservers( ) member function, which calls update( ) for each observer on the list. The
notifyObservers( ) member function is part of the base class Observable.

There are actually two "things that change" in the observer pattern: the quantity of observing
objects and the way an update occurs. That is, the observer pattern allows you to modify both
of these without affecting the surrounding code.

There are a number of ways to implement the observer pattern, but the code shown here will
create a framework from which you can build your own observer code, following the
example. First, this interface describes what an observer looks like:

//: C09:Observer.h
// The Observer interface
#ifndef OBSERVER_H
#define OBSERVER_H

class Observable;
class Argument {};

class Observer {
public:
  // Called by the observed object, whenever
  // the observed object is changed:
  virtual void
  update(Observable* o, Argument* arg) = 0;
};
#endif // OBSERVER_H ///:~

Since Observer interacts with Observable in this approach, Observable must be declared
first. In addition, the Argument class is empty and only acts as a base class for any type of
argument you wish to pass during an update. If you want, you can simply pass the extra
argument as a void*; you'll have to downcast in either case but some folks find void*
objectionable.

Observer is an "interface" class that only has one member function, update( ). This function
is called by the object that's being observed, when that object decides it's time to update all
its observers. The arguments are optional; you could have an update( ) with no arguments
and that would still fit the observer pattern; however, this is more general: it allows the
observed object to pass the object that caused the update (since an Observer may be
registered with more than one observed object) and any extra information if that's helpful,
rather than forcing the Observer object to hunt around to see who is updating and to fetch any
other information it needs.

The "observed object" that decides when and how to do the updating will be called the 
Observable: 

//: C09:Observable.h
// The Observable class
#ifndef OBSERVABLE_H
#define OBSERVABLE_H
#include "Observer.h"
#include <set>

class Observable {
  bool changed;
  std::set<Observer*> observers;
protected:
  virtual void setChanged() { changed = true; }
  virtual void clearChanged() { changed = false; }
public:
  // Start out "unchanged" so the flag is never
  // read before it has been given a value:
  Observable() : changed(false) {}
  virtual ~Observable() {}
  virtual void addObserver(Observer& o) {
    observers.insert(&o);
  }
  virtual void deleteObserver(Observer& o) {
    observers.erase(&o);
  }
  virtual void deleteObservers() {
    observers.clear();
  }
  virtual int countObservers() {
    return observers.size();
  }
  virtual bool hasChanged() { return changed; }
  // If this object has changed, notify all
  // of its observers:
  virtual void notifyObservers(Argument* arg = 0) {
    if(!hasChanged()) return;
    clearChanged(); // Not "changed" anymore
    std::set<Observer*>::iterator it;
    for(it = observers.begin();
      it != observers.end(); it++)
      (*it)->update(this, arg);
  }
};
#endif // OBSERVABLE_H ///:~



Again, the design here is more elaborate than is necessary; as long as there's a way to register
an Observer with an Observable and for the Observable to update its Observers, the set of
member functions doesn't matter. However, this design is intended to be reusable (it was
lifted from the design used in the Java standard library). As mentioned elsewhere in the book,
the Standard C++ libraries described here provide no support for multithreading, so this design
would need to be modified in a multithreaded environment.

Observable has a flag to indicate whether it's been changed. In a simpler design, there would 
be no flag; if something happened, everyone would be notified. The flag allows you to wait, 
and only notify the Observers when you decide the time is right. Notice, however, that the 
control of the flag's state is protected, so that only an inheritor can decide what constitutes a
change, and not the end user of the resulting derived Observable class.

The collection of Observer objects is kept in a set<Observer*> to prevent duplicates; the set
insert( ), erase( ), clear( ) and size( ) functions are exposed to allow Observers to be added
and removed at any time, thus providing runtime flexibility.
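The duplicate-suppressing behavior of the set can be checked in isolation. In this small sketch the empty Observer struct is only a stand-in for the real interface, and the function names registerTwice( ) and removeOne( ) are made up for the illustration.

```cpp
#include <cstddef>
#include <set>

struct Observer {};  // Stand-in for the real Observer interface

// Registering the same observer twice stores it only once.
std::size_t registerTwice() {
  std::set<Observer*> observers;
  Observer a, b;
  observers.insert(&a);
  observers.insert(&a);  // Duplicate: the set ignores it
  observers.insert(&b);
  return observers.size();
}

// Erasing by value removes exactly the one matching pointer.
std::size_t removeOne() {
  std::set<Observer*> observers;
  Observer a, b;
  observers.insert(&a);
  observers.insert(&b);
  observers.erase(&a);
  return observers.size();
}
```

Because the set stores pointers, "duplicate" means the same object address; two distinct observers that behave identically are still registered separately.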



Most of the work is done in notifyObservers( ). If the changed flag has not been set, this
does nothing. Otherwise, it first clears the changed flag so repeated calls to
notifyObservers( ) won't waste time. This is done before notifying the observers in case the
calls to update( ) do anything that causes a change back to this Observable object. Then it
moves through the set and calls back to the update( ) member function of each Observer.

At first it may appear that you can use an ordinary Observable object to manage the updates.
But this doesn't work; to get an effect, you must inherit from Observable and somewhere in
your derived-class code call setChanged( ). This is the member function that sets the
"changed" flag, which means that when you call notifyObservers( ) all of the observers will,
in fact, get notified. Where you call setChanged( ) depends on the logic of your program.
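The following sketch shows the inherit-and-call-setChanged( ) requirement in the smallest form I could manage. So that it stands alone, it uses a condensed copy of the Observer and Observable classes described above (non-virtual members, no deleteObserver( )); the Thermometer and Counter classes are hypothetical and exist only for this illustration.

```cpp
#include <set>

class Observable;
class Argument {};

class Observer {
public:
  virtual ~Observer() {}
  virtual void update(Observable* o, Argument* arg) = 0;
};

// Condensed version of the Observable class described above.
class Observable {
  bool changed;
  std::set<Observer*> observers;
protected:
  void setChanged() { changed = true; }
public:
  Observable() : changed(false) {}
  virtual ~Observable() {}
  void addObserver(Observer& o) { observers.insert(&o); }
  void notifyObservers(Argument* arg = 0) {
    if(!changed) return;   // Nothing to report
    changed = false;       // Clear before notifying
    for(std::set<Observer*>::iterator it = observers.begin();
        it != observers.end(); ++it)
      (*it)->update(this, arg);
  }
};

// Hypothetical observable: only the derived class decides
// what counts as a change, by calling setChanged().
class Thermometer : public Observable {
  int degrees;
public:
  Thermometer() : degrees(0) {}
  void set(int d) {
    if(d != degrees) { degrees = d; setChanged(); }
  }
};

// Hypothetical observer that just counts notifications.
class Counter : public Observer {
public:
  int calls;
  Counter() : calls(0) {}
  void update(Observable*, Argument*) { ++calls; }
};
```

With a Counter registered on a Thermometer, calling notifyObservers( ) before any set( ) does nothing; after set(20) one notification is delivered, and a second notifyObservers( ) or a repeated set(20) delivers none.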

Now we encounter a dilemma. An object that should notify its observers about things that
happen to it (events or changes in state) might have more than one such item of interest. For
example, if you're dealing with a graphical user interface (GUI) item (a button, say) the
items of interest might be the mouse clicked the button, the mouse moved over the button, and
(for some reason) the button changed its color. So we'd like to be able to report all of these
events to different observers, each of which is interested in a different type of event.

The problem is that we would normally reach for multiple inheritance in such a situation: "I'll
inherit from Observable to deal with mouse clicks, and I'll ... er ... inherit from Observable
to deal with mouse-overs, and, well, ... hmm, that doesn't work."



The "interface" idiom 
The "inner class" idiom 

Here's a situation where we do actually need to (in effect) upcast to more than one type, but in
this case we need to provide several different implementations of the same base type. The
solution is something I've lifted from Java, which takes C++'s nested class one step further.
Java has a built-in feature called inner classes, which look like C++'s nested classes, but they
do two other things:

1. A Java inner class automatically has access to the private elements of the class it is nested
within.

2. An object of a Java inner class automatically grabs the "this" to the outer class object it
was created within. In Java, the "outer this" is implicitly dereferenced whenever you
name an element of the outer class.

[[ Insert the definition of a closure ]]. So to implement the inner class idiom in C++, we must
do these things by hand. Here's an example:



//: C09:InnerClassIdiom.cpp
// Example of the "inner class" idiom
#include <iostream>
#include <string>
using namespace std;

class Poingable {
public:
  virtual void poing() = 0;
};

void callPoing(Poingable& p) {
  p.poing();
}

class Bingable {
public:
  virtual void bing() = 0;
};

void callBing(Bingable& b) {
  b.bing();
}

class Outer {
  string name;
  // Define one inner class:
  class Inner1;
  friend class Outer::Inner1;
  class Inner1 : public Poingable {
    Outer* parent;
  public:
    Inner1(Outer* p) : parent(p) {}
    void poing() {
      cout << "poing called for "
        << parent->name << endl;
      // Accesses data in the outer class object
    }
  } inner1;
  // Define a second inner class:
  class Inner2;
  friend class Outer::Inner2;
  class Inner2 : public Bingable {
    Outer* parent;
  public:
    Inner2(Outer* p) : parent(p) {}
    void bing() {
      cout << "bing called for "
        << parent->name << endl;
    }
  } inner2;
public:
  Outer(const string& nm) : name(nm),
    inner1(this), inner2(this) {}
  // Return reference to interfaces
  // implemented by the inner classes:
  operator Poingable&() { return inner1; }
  operator Bingable&() { return inner2; }
};

int main() {
  Outer x("Ping Pong");
  // Like upcasting to multiple base types:
  callPoing(x);
  callBing(x);
} ///:~

The example begins with the Poingable and Bingable interfaces, each of which contains a
single member function. The services provided by callPoing( ) and callBing( ) require that
the object they receive implement the Poingable and Bingable interfaces, respectively, but
they put no other requirements on that object so as to maximize the flexibility of using
callPoing( ) and callBing( ). Note the lack of virtual destructors in either interface: the
intent is that you never perform object destruction via the interface.

Outer contains some private data (name) and it wishes to provide both a Poingable interface
and a Bingable interface so it can be used with callPoing( ) and callBing( ). Of course, in this
situation we could simply use multiple inheritance. This example is just intended to show the
simplest syntax for the idiom; we'll see a real use shortly. To provide a Poingable object
without inheriting Outer from Poingable, the inner class idiom is used. First, the declaration
class Inner1 says that, somewhere, there is a nested class of this name. This allows the friend
declaration for the class, which follows. Finally, now that the nested class has been granted
access to all the private elements of Outer, the class can be defined. Notice that it keeps a
pointer to the Outer which created it, and this pointer must be initialized in the constructor.
Finally, the poing( ) function from Poingable is implemented. The same process occurs for
the second inner class which implements Bingable. Each inner class has a single private
instance created, which is initialized in the Outer constructor. By creating the member objects
and returning references to them, issues of object lifetime are eliminated.

Notice that both inner class definitions are private, and in fact the client programmer doesn't 
have any access to details of the implementation, since the two access methods operator
Poingable&( ) and operator Bingable&( ) only return a reference to the upcast interface, not
to the object that implements it. In fact, since the two inner classes are private, the client 
programmer cannot even downcast to the implementation classes, thus providing complete 
isolation between interface and implementation. 

Just to push a point, I've taken the extra liberty here of defining the automatic type conversion
operators operator Poingable&( ) and operator Bingable&( ). In main( ), you can see that
these actually allow a syntax that looks like Outer is multiply inherited from Poingable and
Bingable. The difference is that the casts in this case are one way. You can get the effect of
an upcast to Poingable or Bingable, but you cannot downcast back to an Outer. In the
following example of observer, you'll see the more typical approach: you provide access to
the inner class objects using ordinary member functions, not automatic type conversion
operations.
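A minimal sketch of that more typical approach, reduced to one interface, might look like this. The accessor name poingable( ) is an assumption made for the illustration, and poing( ) returns a string here (instead of printing) only so the result is easy to inspect. The forward declaration and friend declaration used in the listing above are omitted because, in C++11 and later, a nested class already has access to the private members of its enclosing class.

```cpp
#include <string>

class Poingable {
public:
  virtual std::string poing() = 0;
};

std::string callPoing(Poingable& p) { return p.poing(); }

class Outer {
  std::string name;
  class Inner : public Poingable {
    Outer* parent;  // The hand-made "outer this"
  public:
    explicit Inner(Outer* p) : parent(p) {}
    std::string poing() {
      return "poing called for " + parent->name;
    }
  } inner;
public:
  explicit Outer(const std::string& nm)
    : name(nm), inner(this) {}
  // An ordinary member function, instead of a conversion
  // operator, hands out the interface:
  Poingable& poingable() { return inner; }
};
```

The call site now reads callPoing(x.poingable( )), which makes it visible that an interface object is being handed out rather than x itself being converted.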



The observer example 



Armed with the Observer and Observable header files and the inner class idiom, we can
look at an example of the observer pattern:



//: C09:ObservedFlower.cpp
// Demonstration of "observer" pattern
#include "Observable.h"
#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
using namespace std;

class Flower {
  bool isOpen;
public:
  Flower() : isOpen(false),
    openNotifier(this), closeNotifier(this) {}
  void open() { // Opens its petals
    isOpen = true;
    openNotifier.notifyObservers();
    closeNotifier.open();
  }
  void close() { // Closes its petals
    isOpen = false;
    closeNotifier.notifyObservers();
    openNotifier.close();
  }
  // Using the "inner class" idiom:
  class OpenNotifier;
  friend class Flower::OpenNotifier;
  class OpenNotifier : public Observable {
    Flower* parent;
    bool alreadyOpen;
  public:
    OpenNotifier(Flower* f) : parent(f),
      alreadyOpen(false) {}
    void notifyObservers(Argument* arg = 0) {
      if(parent->isOpen && !alreadyOpen) {
        setChanged();
        Observable::notifyObservers(arg);
        alreadyOpen = true;
      }
    }
    void close() { alreadyOpen = false; }
  } openNotifier;
  class CloseNotifier;
  friend class Flower::CloseNotifier;
  class CloseNotifier : public Observable {
    Flower* parent;
    bool alreadyClosed;
  public:
    CloseNotifier(Flower* f) : parent(f),
      alreadyClosed(false) {}
    void notifyObservers(Argument* arg = 0) {
      if(!parent->isOpen && !alreadyClosed) {
        setChanged();
        Observable::notifyObservers(arg);
        alreadyClosed = true;
      }
    }
    void open() { alreadyClosed = false; }
  } closeNotifier;
};

class Bee {
  string name;
  // An "inner class" for observing openings:
  class OpenObserver;
  friend class Bee::OpenObserver;
  class OpenObserver : public Observer {
    Bee* parent;
  public:
    OpenObserver(Bee* b) : parent(b) {}
    void update(Observable*, Argument*) {
      cout << "Bee " << parent->name
        << "'s breakfast time!\n";
    }
  } openObsrv;
  // Another "inner class" for closings:
  class CloseObserver;
  friend class Bee::CloseObserver;
  class CloseObserver : public Observer {
    Bee* parent;
  public:
    CloseObserver(Bee* b) : parent(b) {}
    void update(Observable*, Argument*) {
      cout << "Bee " << parent->name
        << "'s bed time!\n";
    }
  } closeObsrv;
public:
  Bee(string nm) : name(nm),
    openObsrv(this), closeObsrv(this) {}
  Observer& openObserver() { return openObsrv; }
  Observer& closeObserver() { return closeObsrv; }
};

class Hummingbird {
  string name;
  class OpenObserver;
  friend class Hummingbird::OpenObserver;
  class OpenObserver : public Observer {
    Hummingbird* parent;
  public:
    OpenObserver(Hummingbird* h) : parent(h) {}
    void update(Observable*, Argument*) {
      cout << "Hummingbird " << parent->name
        << "'s breakfast time!\n";
    }
  } openObsrv;
  class CloseObserver;
  friend class Hummingbird::CloseObserver;
  class CloseObserver : public Observer {
    Hummingbird* parent;
  public:
    CloseObserver(Hummingbird* h) : parent(h) {}
    void update(Observable*, Argument*) {
      cout << "Hummingbird " << parent->name
        << "'s bed time!\n";
    }
  } closeObsrv;
public:
  Hummingbird(string nm) : name(nm),
    openObsrv(this), closeObsrv(this) {}
  Observer& openObserver() { return openObsrv; }
  Observer& closeObserver() { return closeObsrv; }
};

int main() {
  Flower f;
  Bee ba("A"), bb("B");
  Hummingbird ha("A"), hb("B");
  f.openNotifier.addObserver(ha.openObserver());
  f.openNotifier.addObserver(hb.openObserver());
  f.openNotifier.addObserver(ba.openObserver());
  f.openNotifier.addObserver(bb.openObserver());
  f.closeNotifier.addObserver(ha.closeObserver());
  f.closeNotifier.addObserver(hb.closeObserver());
  f.closeNotifier.addObserver(ba.closeObserver());
  f.closeNotifier.addObserver(bb.closeObserver());
  // Hummingbird B decides to sleep in:
  f.openNotifier.deleteObserver(hb.openObserver());
  // Something changes that interests observers:
  f.open();
  f.open(); // It's already open, no change.
  // Bee A doesn't want to go to bed:
  f.closeNotifier.deleteObserver(
    ba.closeObserver());
  f.close();
  f.close(); // It's already closed; no change
  f.openNotifier.deleteObservers();
  f.open();
  f.close();
} ///:~





The events of interest are that a Flower can open or close. Because of the use of the inner 
class idiom, both these events can be separately-observable phenomena. OpenNotifier and 
CloseNotifier both inherit Observable, so they have access to setChanged( ) and can be 
handed to anything that needs an Observable. You'll notice that, contrary to 
InnerClassIdiom.cpp, the Observable descendants are public. This is because some of their 
member functions must be available to the client programmer. There's nothing that says that 
an inner class must be private; in InnerClassIdiom.cpp I was simply following the design 
guideline "make things as private as possible." You could make the classes private and 
expose the appropriate methods by proxy in Flower, but it wouldn't gain much. 

The inner class idiom also comes in handy to define more than one kind of Observer, in Bee 
and Hummingbird, since both those classes may want to independently observe Flower 
openings and closings. Notice how the inner class idiom provides something that has most of 
the benefits of inheritance (the ability to access the private data in the outer class, for 
example) without the same restrictions. 

In main( ), you can see one of the prime benefits of the observer pattern: the ability to change
behavior at runtime by dynamically registering and un-registering Observers with
Observables.

If you study the code above you'll see that OpenNotifier and CloseNotifier use the basic
Observable interface. This means that you could inherit other completely different Observer
classes; the only connection the Observers have with Flowers is the Observer interface.



Multiple dispatching 



Suppose you want to be able to say Number + Number, Number * Number, etc., where Number is the base 
class for a family of numerical objects. But when you say a + b and you don't know the exact 
type of either a or b, how can you get them to interact properly? 
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The answer starts with something you probably don't think about: C++ performs only single 
dispatching. That is, if you are performing an operation on more than one object whose type is 
unknown, C++ can invoke the dynamic binding mechanism on only one of those types. This 
doesn't solve the problem, so you end up detecting some types manually and effectively 
producing your own dynamic binding behavior. 

The solution is called multiple dispatching. Remember that polymorphism can occur only via 
member function calls, so if you want double dispatching to occur, there must be two member 
function calls: the first to determine the first unknown type, and the second to determine the 
second unknown type. With multiple dispatching, you must have a virtual call to determine 
each of the types. Generally, you'll set up a configuration such that a single member function 
call produces more than one dynamic member function call and thus determines more than 
one type in the process. To get this effect, you need to work with more than one virtual 
function: you'll need a virtual function call for each dispatch. The virtual functions in the 
following example are called compete( ) and eval( ), and are both members of the same type. 
(In this case there will be only two dispatches, which is referred to as double dispatching). If 
you are working with two different type hierarchies that are interacting, then you'll have to 
have a virtual call in each hierarchy. 

Here's an example of multiple dispatching: 

//: C09:PaperScissorsRock.cpp
// Demonstration of multiple dispatching
#include "../purge.h"
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

class Paper;
class Scissors;
class Rock;

enum Outcome { win, lose, draw };

ostream&
operator<<(ostream& os, const Outcome out) {
  switch(out) {
    default:
    case win: return os << "win";
    case lose: return os << "lose";
    case draw: return os << "draw";
  }
}

class Item {
public:
  // First dispatch: resolves the type of the left-hand Item
  virtual Outcome compete(const Item*) = 0;
  // Second dispatch: resolves the type of the right-hand Item
  virtual Outcome eval(const Paper*) const = 0;
  virtual Outcome eval(const Scissors*) const = 0;
  virtual Outcome eval(const Rock*) const = 0;
  virtual ostream& print(ostream& os) const = 0;
  virtual ~Item() {}
  friend ostream&
  operator<<(ostream& os, const Item* it) {
    return it->print(os);
  }
};

class Paper : public Item {
public:
  Outcome compete(const Item* it) {
    return it->eval(this);
  }
  // Each eval( ) reports the outcome for its argument
  Outcome eval(const Paper*) const {
    return draw;
  }
  Outcome eval(const Scissors*) const {
    return win;
  }
  Outcome eval(const Rock*) const {
    return lose;
  }
  ostream& print(ostream& os) const {
    return os << "Paper   ";
  }
};

class Scissors : public Item {
public:
  Outcome compete(const Item* it) {
    return it->eval(this);
  }
  Outcome eval(const Paper*) const {
    return lose;
  }
  Outcome eval(const Scissors*) const {
    return draw;
  }
  Outcome eval(const Rock*) const {
    return win;
  }
  ostream& print(ostream& os) const {
    return os << "Scissors";
  }
};

class Rock : public Item {
public:
  Outcome compete(const Item* it) {
    return it->eval(this);
  }
  Outcome eval(const Paper*) const {
    return win;
  }
  Outcome eval(const Scissors*) const {
    return lose;
  }
  Outcome eval(const Rock*) const {
    return draw;
  }
  ostream& print(ostream& os) const {
    return os << "Rock    ";
  }
};

struct ItemGen {
  ItemGen() { srand(time(0)); }
  Item* operator()() {
    switch(rand() % 3) {
      default:
      case 0:
        return new Scissors;
      case 1:
        return new Paper;
      case 2:
        return new Rock;
    }
  }
};

struct Compete {
  Outcome operator()(Item* a, Item* b) {
    cout << a << "\t" << b << "\t";
    return a->compete(b);
  }
};

int main() {
  const int sz = 20;
  vector<Item*> v(sz*2);
  generate(v.begin(), v.end(), ItemGen());
  // Pair the first sz Items against the second sz Items
  transform(v.begin(), v.begin() + sz,
    v.begin() + sz,
    ostream_iterator<Outcome>(cout, "\n"),
    Compete());
  purge(v);
} ///:~



Visitor, a type of multiple dispatching 

The assumption is that you have a primary class hierarchy that is fixed; perhaps it's from 
another vendor and you can't make changes to that hierarchy. However, you'd like to add new 
polymorphic methods to that hierarchy, which means that normally you'd have to add 
something to the base class interface. So the dilemma is that you need to add methods to the 
base class, but you can't touch the base class. How do you get around this? 

The design pattern that solves this kind of problem is called a "visitor" (the final one in the 
Design Patterns book), and it builds on the double dispatching scheme shown in the last 
section. 

The visitor pattern allows you to extend the interface of the primary type by creating a 
separate class hierarchy of type Visitor to virtualize the operations performed upon the 
primary type. The objects of the primary type simply "accept" the visitor, then call the 
visitor's dynamically-bound member function. 

//: C09:BeeAndFlowers.cpp
// Demonstration of "visitor" pattern
#include "../purge.h"
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

class Gladiolus;
class Renuculus;
class Chrysanthemum;

class Visitor {
public:
  virtual void visit(Gladiolus* f) = 0;
  virtual void visit(Renuculus* f) = 0;
  virtual void visit(Chrysanthemum* f) = 0;
  virtual ~Visitor() {}
};

class Flower {
public:
  virtual void accept(Visitor&) = 0;
  virtual ~Flower() {}
};

class Gladiolus : public Flower {
public:
  virtual void accept(Visitor& v) {
    v.visit(this);
  }
};

class Renuculus : public Flower {
public:
  virtual void accept(Visitor& v) {
    v.visit(this);
  }
};

class Chrysanthemum : public Flower {
public:
  virtual void accept(Visitor& v) {
    v.visit(this);
  }
};

// Add the ability to produce a string:
class StringVal : public Visitor {
  string s;
public:
  operator const string&() { return s; }
  virtual void visit(Gladiolus*) {
    s = "Gladiolus";
  }
  virtual void visit(Renuculus*) {
    s = "Renuculus";
  }
  virtual void visit(Chrysanthemum*) {
    s = "Chrysanthemum";
  }
};

// Add the ability to do "Bee" activities:
class Bee : public Visitor {
public:
  virtual void visit(Gladiolus*) {
    cout << "Bee and Gladiolus\n";
  }
  virtual void visit(Renuculus*) {
    cout << "Bee and Renuculus\n";
  }
  virtual void visit(Chrysanthemum*) {
    cout << "Bee and Chrysanthemum\n";
  }
};

struct FlowerGen {
  FlowerGen() { srand(time(0)); }
  Flower* operator()() {
    switch(rand() % 3) {
      default:
      case 0: return new Gladiolus;
      case 1: return new Renuculus;
      case 2: return new Chrysanthemum;
    }
  }
};

int main() {
  vector<Flower*> v(10);
  generate(v.begin(), v.end(), FlowerGen());
  vector<Flower*>::iterator it;
  // It's almost as if I added a virtual function
  // to produce a Flower string representation:
  StringVal sval;
  for(it = v.begin(); it != v.end(); it++) {
    (*it)->accept(sval);
    cout << string(sval) << endl;
  }
  // Perform "Bee" operation on all Flowers:
  Bee bee;
  for(it = v.begin(); it != v.end(); it++)
    (*it)->accept(bee);
  purge(v);
} ///:~



Efficiency 

Flyweight 

The composite 

Evolving a design: the trash 
recycler 

The nature of this problem is that the trash is thrown unclassified into a single bin, so the 
specific type information is lost. But later, the specific type information must be recovered to 
properly sort the trash. In the initial solution, RTTI (described in Chapter XX) is used. 

This is not a trivial design because it has an added constraint. That's what makes it 
interesting: it's more like the messy problems you're likely to encounter in your work. The 
extra constraint is that the trash arrives at the trash recycling plant all mixed together. The 
program must model the sorting of that trash. This is where RTTI comes in: you have a bunch 
of anonymous pieces of trash, and the program figures out exactly what type they are. 

One of the objectives of this program is to sum up the weight and value of the different types 
of trash. The trash will be kept in (potentially different types of) containers, so it makes sense 
to templatize the "summation" function on the container holding it (assuming that container 
exhibits basic STL-like behavior), so the function will be maximally flexible: 

//: C09:sumValue.h
// Sums the value of Trash in any type of STL
// container of any specific type of Trash:
#ifndef SUMVALUE_H
#define SUMVALUE_H
#include <typeinfo>
#include <vector>

// Relies on a global ostream named "out" that the
// including program provides.
template<typename Cont>
void sumValue(const Cont& bin) {
  double val = 0.0;
  // bin is const, so a const_iterator is required:
  typename Cont::const_iterator tally = bin.begin();
  while(tally != bin.end()) {
    val += (*tally)->weight() * (*tally)->value();
    out << "weight of "
        << typeid(*(*tally)).name()
        << " = " << (*tally)->weight()
        << std::endl;
    tally++;
  }
  out << "Total value = " << val << std::endl;
}
#endif // SUMVALUE_H ///:~

When you look at a piece of code like this, it can be initially disturbing because you might 
wonder "how can the compiler know that the member functions I'm calling here are valid?" 
But of course, all the template says is "generate this code on demand," and so only when you 
call the function will type checking come into play. This enforces that *tally produces an 
object that has member functions weight( ) and value( ), and that out is a global ostream. 

The sumValue( ) function is templatized on the type of container that's holding the Trash 
pointers. Notice there's nothing in the template signature that says "this container must 
behave like an STL container and must hold Trash*"; that is all implied in the code that's 
generated which uses the container. 

The first version of the example takes the straightforward approach: creating a 
vector<Trash*>, filling it with Trash objects, then using RTTI to sort them out: 

//: C09:Recycle1.cpp
// Recycling with RTTI
#include "sumValue.h"
#include "../purge.h"
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <typeinfo>
#include <cstdlib>
#include <ctime>
using namespace std;
ofstream out("Recycle1.out");

class Trash {
  double _weight;
  static int _count; // # created
  static int _dcount; // # destroyed
  // Disallow automatic creation of
  // assignment & copy-constructor:
  void operator=(const Trash&);
  Trash(const Trash&);
public:
  Trash(double wt) : _weight(wt) {
    _count++;
  }
  virtual double value() const = 0;
  double weight() const { return _weight; }
  static int count() { return _count; }
  static int dcount() { return _dcount; }
  virtual ~Trash() { _dcount++; }
};

int Trash::_count = 0;
int Trash::_dcount = 0;

class Aluminum : public Trash {
  static double val;
public:
  Aluminum(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Aluminum() { out << "~Aluminum\n"; }
};

double Aluminum::val = 1.67F;

class Paper : public Trash {
  static double val;
public:
  Paper(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Paper() { out << "~Paper\n"; }
};

double Paper::val = 0.10F;

class Glass : public Trash {
  static double val;
public:
  Glass(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Glass() { out << "~Glass\n"; }
};

double Glass::val = 0.23F;

class TrashGen {
public:
  TrashGen() { srand(time(0)); }
  static double frand(int mod) {
    return static_cast<double>(rand() % mod);
  }
  Trash* operator()() {
    for(int i = 0; i < 30; i++)
      switch(rand() % 3) {
        case 0 :
          return new Aluminum(frand(100));
        case 1 :
          return new Paper(frand(100));
        case 2 :
          return new Glass(frand(100));
      }
    return new Aluminum(0);
    // Or throw exception...
  }
};

int main() {
  vector<Trash*> bin;
  // Fill up the Trash bin:
  generate_n(back_inserter(bin), 30, TrashGen());
  vector<Aluminum*> alBin;
  vector<Paper*> paperBin;
  vector<Glass*> glassBin;
  vector<Trash*>::iterator sorter = bin.begin();
  // Sort the Trash:
  while(sorter != bin.end()) {
    Aluminum* ap =
      dynamic_cast<Aluminum*>(*sorter);
    Paper* pp = dynamic_cast<Paper*>(*sorter);
    Glass* gp = dynamic_cast<Glass*>(*sorter);
    if(ap) alBin.push_back(ap);
    if(pp) paperBin.push_back(pp);
    if(gp) glassBin.push_back(gp);
    sorter++;
  }
  sumValue(alBin);
  sumValue(paperBin);
  sumValue(glassBin);
  sumValue(bin);
  out << "total created = "
      << Trash::count() << endl;
  purge(bin);
  out << "total destroyed = "
      << Trash::dcount() << endl;
} ///:~

This uses the classic structure of virtual functions in the base class that are redefined in the 
derived class. In addition, there are two static data members in the base class: _count to 
indicate the number of Trash objects that are created, and _dcount to keep track of the 
number that are destroyed. This verifies that proper memory management occurs. To support 
this, the operator= and copy-constructor are disallowed by declaring them private (no 
definitions are necessary; this simply prevents the compiler from synthesizing them). Those 
operations would cause problems with the count, and if they were allowed you'd have to 
define them properly. 

The Trash objects are created, for the sake of this example, by the generator TrashGen, 
which uses the random number generator to choose the type of Trash, and also to provide it 
with a "weight" argument. The return value of the generator's operator() is upcast to 
Trash*, so all the specific type information is lost. In main( ), a vector<Trash*> called bin 
is created and then filled using the STL algorithm generate_n( ). To perform the sorting, 
three vectors are created, each of which holds a different type of Trash*. An iterator moves 
through bin and RTTI is used to determine which specific type of Trash the iterator is 
currently selecting, placing each into the appropriate typed bin. Finally, sumValue( ) is 
applied to each of the containers, and the Trash objects are cleaned up using purge( ) 
(defined in Chapter XX). The creation and destruction counts ensure that things are properly 
cleaned up. 



Of course, it seems silly to upcast the types of Trash into a container holding base type 
pointers, and then to turn around and downcast. Why not just put the trash into the appropriate 
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receptacle in the first place? (indeed, this is the whole enigma of recycling). In this program it 
might be easy to repair, but sometimes a system's structure and flexibility can benefit greatly 
from downcasting. 

The program satisfies the design requirements: it works. This may be fine as long as it's a 
one-shot solution. However, a good program will evolve over time, so you must ask: what if 
the situation changes? For example, cardboard is now a valuable recyclable commodity, so 
how will that be integrated into the system (especially if the program is large and 
complicated)? Since the above type-check coding in the switch statement and in the RTTI 
statements could be scattered throughout the program, you'd have to go find all that code 
every time a new type was added, and if you miss one the compiler won't help you. 

The key to the misuse of RTTI here is that every type is tested. If you're only looking for a 
subset of types because that subset needs special treatment, that's probably fine. But if you're 
hunting for every type inside a switch statement, then you're probably missing an important 
point, and definitely making your code less maintainable. In the next section we'll look at 
how this program evolved over several stages to become much more flexible. This should 
prove a valuable example in program design. 



Improving the design 



The solutions in Design Patterns are organized around the question "What will change as this 
program evolves?" This is usually the most important question that you can ask about any 
design. If you can build your system around the answer, the results will be two-pronged: not 
only will your system allow easy (and inexpensive) maintenance, but you might also produce 
components that are reusable, so that other systems can be built more cheaply. This is the 
promise of object-oriented programming, but it doesn't happen automatically; it requires 
thought and insight on your part. In this section we'll see how this process can happen during 
the refinement of a system. 

The answer to the question "What will change?" for the recycling system is a common one: 
more types will be added to the system. The goal of the design, then, is to make this addition 
of types as painless as possible. In the recycling program, we'd like to encapsulate all places 
where specific type information is mentioned, so (if for no other reason) any changes can be 
localized inside those encapsulations. It turns out that this process also cleans up the rest of 
the code considerably. 



"Make more objects" 



This brings up a general object-oriented design principle that I first heard spoken by Grady 
Booch: "If the design is too complicated, make more objects." This is simultaneously 
counterintuitive and ludicrously simple, and yet it's the most useful guideline I've found. 
(You might observe that "make more objects" is often equivalent to "add another level of 
indirection.") In general, if you find a place with messy code, consider what sort of class 
would clean things up. Often the side effect of cleaning up the code will be a system that has 
better structure and is more flexible. 

Consider first the place where Trash objects are created. In the above example, we're 
conveniently using a generator to create the objects. The generator nicely encapsulates the 
creation of the objects, but the neatness is an illusion because in general we'll want to create 
the objects based on something more than a random number generator. Some information will 
be available which will determine what kind of Trash object this should be. Because you 
generally need to make your objects by examining some kind of information, if you're not 
paying close attention you may end up with switch statements (as in TrashGen) or cascaded 
if statements scattered throughout your code. This is definitely messy, and also a place where 
you must change code whenever a new type is added. If new types are commonly added, a 
better solution is a single member function that takes all of the necessary information and 
produces an object of the correct type, already upcast to a Trash pointer. In Design Patterns 
this is broadly referred to as a creational pattern (of which there are several). The specific 
pattern that will be applied here is a variant of the Factory Method ("method" being a more 
OOPish way to refer to a member function). Here, the factory method will be a static member 
of Trash, but more commonly it is a member function that is overridden in the derived class. 

The idea of the factory method is that you pass it the essential information it needs to know to 
create your object, then stand back and wait for the pointer (already upcast to the base type) to 
pop out as the return value. From then on, you treat the object polymorphically. Thus, you 
never even need to know the exact type of object that's created. In fact, the factory method 
hides it from you to prevent accidental misuse. If you want to use the object without 
polymorphism, you must explicitly use RTTI and casting. 

But there's a little problem, especially when you use the more complicated approach (not 
shown here) of making the factory method in the base class and overriding it in the derived 
classes. What if the information required in the derived class requires more or different 
arguments? "Creating more objects" solves this problem. To implement the factory method, 
the Trash class gets a new member function called factory( ). To hide the creational data, 
there's a new class called Info that contains all of the necessary information for the factory() 
method to create the appropriate Trash object. Here's a simple implementation of Info: 

class Info {
  int type;
  // Must change this to add another type:
  static const int maxnum = 3;
  double data;
public:
  Info(int typeNum, double dat)
    : type(typeNum % maxnum), data(dat) {}
};

An Info object's only job is to hold information for the factory( ) method. Now, if there's a 
situation in which factory( ) needs more or different information to create a new type of 
Trash object, the factory( ) interface doesn't need to be changed. The Info class can be 
changed by adding new data and new constructors, or in the more typical object-oriented 
fashion of subclassing. 
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Here's the second version of the program with the factory method added. The object-counting 
code has been removed; we'll assume proper cleanup will take place in all the rest of the 
examples. 

//: C09:Recycle2.cpp
// Adding a factory method
#include "sumValue.h"
#include "../purge.h"
#include <fstream>
#include <vector>
#include <typeinfo>
#include <cstdlib>
#include <ctime>
using namespace std;
ofstream out("Recycle2.out");

class Trash {
  double _weight;
  void operator=(const Trash&);
  Trash(const Trash&);
public:
  Trash(double wt) : _weight(wt) { }
  virtual double value() const = 0;
  double weight() const { return _weight; }
  virtual ~Trash() {}
  // Nested class because it's tightly coupled
  // to Trash:
  class Info {
    int type;
    // Must change this to add another type:
    static const int maxnum = 3;
    double data;
    friend class Trash;
  public:
    Info(int typeNum, double dat)
      : type(typeNum % maxnum), data(dat) {}
  };
  static Trash* factory(const Info& info);
};

class Aluminum : public Trash {
  static double val;
public:
  Aluminum(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Aluminum() { out << "~Aluminum\n"; }
};

double Aluminum::val = 1.67F;

class Paper : public Trash {
  static double val;
public:
  Paper(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Paper() { out << "~Paper\n"; }
};

double Paper::val = 0.10F;

class Glass : public Trash {
  static double val;
public:
  Glass(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newval) {
    val = newval;
  }
  ~Glass() { out << "~Glass\n"; }
};

double Glass::val = 0.23F;

// Definition of the factory method. It must know
// all the types, so is defined after all the
// subtypes are defined:
Trash* Trash::factory(const Info& info) {
  switch(info.type) {
    default: // In case of overrun
    case 0:
      return new Aluminum(info.data);
    case 1:
      return new Paper(info.data);
    case 2:
      return new Glass(info.data);
  }
}

// Generator for Info objects:
class InfoGen {
  int typeQuantity;
  int maxWeight;
public:
  InfoGen(int typeQuant, int maxWt)
    : typeQuantity(typeQuant), maxWeight(maxWt) {
    srand(time(0));
  }
  Trash::Info operator()() {
    return Trash::Info(rand() % typeQuantity,
      static_cast<double>(rand() % maxWeight));
  }
};

int main() {
  vector<Trash*> bin;
  // Fill up the Trash bin:
  InfoGen infoGen(3, 100);
  for(int i = 0; i < 30; i++)
    bin.push_back(Trash::factory(infoGen()));
  vector<Aluminum*> alBin;
  vector<Paper*> paperBin;
  vector<Glass*> glassBin;
  vector<Trash*>::iterator sorter = bin.begin();
  // Sort the Trash:
  while(sorter != bin.end()) {
    Aluminum* ap =
      dynamic_cast<Aluminum*>(*sorter);
    Paper* pp = dynamic_cast<Paper*>(*sorter);
    Glass* gp = dynamic_cast<Glass*>(*sorter);
    if(ap) alBin.push_back(ap);
    if(pp) paperBin.push_back(pp);
    if(gp) glassBin.push_back(gp);
    sorter++;
  }
  sumValue(alBin);
  sumValue(paperBin);
  sumValue(glassBin);
  sumValue(bin);
  purge(bin); // Cleanup
} ///:~

In the factory method Trash: :factory(), the determination of the exact type of object is 
simple, but you can imagine a more complicated system in which factory( ) uses an elaborate 
algorithm. The point is that it's now hidden away in one place, and you know to come to this 
place to make changes when you add new types. 

The creation of new objects is now more general in main( ), and depends on "real" data 
(albeit created by another generator, driven by random numbers). The generator object is 
created, telling it the maximum type number and the largest "data" value to produce. Each call 
to the generator creates an Info object which is passed into Trash::factory( ), which in turn 
produces some kind of Trash object and returns the pointer that's added to the 
vector<Trash*> bin. 

The constructor for the Info object is very specific and restrictive in this example. However, 
you could also imagine a vector of arguments into the Info constructor (or directly into a 
factory( ) call, for that matter). This requires that the arguments be parsed and checked at 
runtime, but it does provide the greatest flexibility. 

You can see from this code what "vector of change" problem the factory is responsible for 
solving: if you add new types to the system (the change), the only code that must be modified 
is within the factory, so the factory isolates the effect of that change. 

A pattern for prototyping creation 

A problem with the design above is that it still requires a central location where all the types 
of the objects must be known: inside the factory( ) method. If new types are regularly being 
added to the system, the factory( ) method must be changed for each new type. When you 
discover something like this, it is useful to try to go one step further and move all of the 
activities involving that specific type, including its creation, into the class representing that 
type. This way, the only thing you need to do to add a new type to the system is to inherit a 
single class. 

To move the information concerning type creation into each specific type of Trash, the 
"prototype" pattern will be used. The general idea is that you have a master container of 
objects, one of each type you're interested in making. The "prototype objects" in this 
container are used only for making new objects. In this case, we'll name the object-creation 
member function clone( ). When you're ready to make a new object, presumably you have 
some sort of information that establishes the type of object you want to create. The factory( ) 
method (it's not required that you use factory with prototype, but they commingle nicely) 
moves through the master container comparing your information with whatever appropriate 
information is in the prototype objects in the master container. When a match is found, 
factory( ) returns a clone of that object. 

In this scheme there is no hard-coded information for creation. Each object knows how to 
expose appropriate information to allow matching, and how to clone itself. Thus, the 
factory( ) method doesn't need to be changed when a new type is added to the system. 
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The prototypes will be contained in a static vector<Trash*> called prototypes. This is a 
private member of the base class Trash. The friend class TrashPrototypeInit is responsible 
for putting the Trash* prototypes into the prototype list. 

You'll also note that the Info class has changed. It now uses a string to act as type 
identification information. As you shall see, this will allow us to read object information from 
a file when creating Trash objects. 





//: C09:Trash.h
#ifndef TRASH_H
#define TRASH_H
#include <iostream>
#include <exception>
#include <vector>
#include <string>

class TypedBin; // For a later example
class Visitor; // For a later example

class Trash {
  double _weight;
  void operator=(const Trash&);
  Trash(const Trash&);
public:
  Trash(double wt) : _weight(wt) {}
  virtual double value() const = 0;
  double weight() const { return _weight; }
  virtual ~Trash() {}
  class Info {
    std::string _id;
    double _data;
  public:
    Info(std::string ident, double dat)
      : _id(ident), _data(dat) {}
    double data() const { return _data; }
    std::string id() const { return _id; }
    friend std::ostream& operator<<(
      std::ostream& os, const Info& info) {
      return os << info._id << ':' << info._data;
    }
  };
protected:
  // Remainder of class provides support for
  // prototyping:
  static std::vector<Trash*> prototypes;
  friend class TrashPrototypeInit;
  Trash() : _weight(0) {}
public:
  static Trash* factory(const Info& info);
  virtual std::string id() = 0; // type id
  virtual Trash* clone(const Info&) = 0;
  // Stubs, inserted for later use:
  virtual bool
  addToBin(std::vector<TypedBin*>&) {
    return false;
  }
  virtual void accept(Visitor&) {}
};
#endif // TRASH_H ///:~



The basic part of the Trash class remains as before. The rest of the class supports the 
prototyping pattern. The id( ) member function returns a string that can be compared with the 
id( ) of an Info object to determine whether this is the prototype that should be cloned (of 
course, the evaluation can be much more sophisticated than that if you need it). Both id( ) and 
clone( ) are pure virtual functions so they must be overridden in derived classes. 

The last two member functions, addToBin( ) and accept( ), are "stubs" which will be used in 
later versions of the trash sorting problem. It's necessary to have these virtual functions in the 
base class, but in the early examples there's no need for them, so they are not pure virtuals so 
as not to intrude. 

The factory( ) member function has the same declaration, but the definition is what handles 
the prototyping. Here is the implementation file: 

//: C09:Trash.cpp {O}
#include "Trash.h"
using namespace std;

Trash* Trash::factory(const Info& info) {
  vector<Trash*>::iterator it;
  for(it = prototypes.begin();
    it != prototypes.end(); it++) {
    // Somehow determine the new type
    // to create, and clone one:
    if (info.id() == (*it)->id())
      return (*it)->clone(info);
  }
  cerr << "Prototype not found for "
    << info << endl;
  // "Default" to first one in the vector:
  return (*prototypes.begin())->clone(info);
} ///:~
The string inside the Info object contains the type name of the Trash to be created; this 
string is compared to the id( ) values of the objects in prototypes. If there's a match, then 
that's the object to create. 

Of course, the appropriate prototype object might not be in the prototypes list. In this case, 
the return in the inner loop is never executed and you'll drop out at the end, where a default 
value is created. It might be more appropriate to throw an exception here. 

As you can see from the code, there's nothing that knows about specific types of Trash. The 
beauty of this design is that this code doesn't need to be changed, regardless of the different 
situations it will be used in. 



Trash subclasses 

To fit into the prototyping scheme, each new subclass of Trash must follow some rules. First, 
it must create a protected default constructor, so that no one but TrashPrototypeInit may 
use it. TrashPrototypeInit is a singleton, creating one and only one prototype object for each 
subtype. This guarantees that the Trash subtype will be properly represented in the 
prototypes container. 

After defining the "ordinary" member functions and data that the Trash object will actually 
use, the class must also override the id( ) member (which in this case returns a string for 
comparison) and the clone( ) function, which must know how to pull the appropriate 
information out of the Info object in order to create the object correctly. 

Here are the different types of Trash, each in their own file. 

//: C09:Aluminum.h
// The Aluminum class with prototyping
#ifndef ALUMINUM_H
#define ALUMINUM_H
#include "Trash.h"

class Aluminum : public Trash {
  static double val;
protected:
  Aluminum() {}
  friend class TrashPrototypeInit;
public:
  Aluminum(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newVal) {
    val = newVal;
  }
  std::string id() { return "Aluminum"; }
  Trash* clone(const Info& info) {
    return new Aluminum(info.data());
  }
};
#endif // ALUMINUM_H ///:~

//: C09:Paper.h
// The Paper class with prototyping
#ifndef PAPER_H
#define PAPER_H
#include "Trash.h"

class Paper : public Trash {
  static double val;
protected:
  Paper() {}
  friend class TrashPrototypeInit;
public:
  Paper(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newVal) {
    val = newVal;
  }
  std::string id() { return "Paper"; }
  Trash* clone(const Info& info) {
    return new Paper(info.data());
  }
};
#endif // PAPER_H ///:~

//: C09:Glass.h
// The Glass class with prototyping
#ifndef GLASS_H
#define GLASS_H
#include "Trash.h"

class Glass : public Trash {
  static double val;
protected:
  Glass() {}
  friend class TrashPrototypeInit;
public:
  Glass(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newVal) {
    val = newVal;
  }
  std::string id() { return "Glass"; }
  Trash* clone(const Info& info) {
    return new Glass(info.data());
  }
};
#endif // GLASS_H ///:~

And here's a new type of Trash: 

//: C09:Cardboard.h
// The Cardboard class with prototyping
#ifndef CARDBOARD_H
#define CARDBOARD_H
#include "Trash.h"

class Cardboard : public Trash {
  static double val;
protected:
  Cardboard() {}
  friend class TrashPrototypeInit;
public:
  Cardboard(double wt) : Trash(wt) {}
  double value() const { return val; }
  static void value(double newVal) {
    val = newVal;
  }
  std::string id() { return "Cardboard"; }
  Trash* clone(const Info& info) {
    return new Cardboard(info.data());
  }
};
#endif // CARDBOARD_H ///:~

The static val data members must be defined and initialized in a separate code file:

//: C09:TrashStatics.cpp {O}
// Contains the static definitions for
// the Trash type's "val" data members
#include "Trash.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "Cardboard.h"

double Aluminum::val = 1.67;
double Paper::val = 0.10;
double Glass::val = 0.23;
double Cardboard::val = 0.14;
///:~



There's one other issue: initialization of the static data members. TrashPrototypeInit must
create the prototype objects and add them to the static Trash::prototypes vector. So it's very
important that you control the order of initialization of the static objects, so the prototypes
vector is created before any of the prototype objects, which depend on the prior existence of
prototypes. The most straightforward way to do this is to put all the definitions in a single
file, in the order in which you want them initialized.
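
To illustrate the ordering rule, here is a small sketch in which a container is defined ahead of the objects that register themselves in it. The names registry, Registrar, and checkRegistry( ) are made up for this example. It relies on the language guarantee that, within a single translation unit, ordinary non-local objects are dynamically initialized in the order of their definitions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Defined first, so it is constructed first within this file.
std::vector<std::string> registry;

// Hypothetical registrar: its constructor assumes registry already exists.
struct Registrar {
  explicit Registrar(const std::string& name) {
    registry.push_back(name);
  }
};

// Defined after registry, so these constructors run after it is built.
Registrar aluminumEntry("Aluminum");
Registrar paperEntry("Paper");

// Call from main(): by then both registrars have already run.
void checkRegistry() {
  assert(registry.size() == 2);
  assert(registry[0] == "Aluminum");
  assert(registry[1] == "Paper");
}
```

If registry were moved to a different source file, nothing would guarantee it was constructed before the Registrar objects, which is exactly the hazard described above.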

TrashPrototypeInit must be defined separately because it inserts the actual prototypes into
the vector, and throughout the chapter we'll be inheriting new types of Trash from the
existing types. By making this one class in a separate file, a different version can be created
and linked in for the new situations, leaving the rest of the code in the system alone.

This is taken a step further by making TrashPrototypeInit a singleton (the constructor is
private), even though the class definition is not available in a header file so it would seem
safe enough to assume that no one could accidentally make a second instance.

Unfortunately, this is one more separate piece of code you must maintain whenever you add a
new type to the system. However, it's not too bad since the linker should give you an error
message if you forget (since prototypes is defined in this file as well). The really difficult
problems come when you don't get any warnings or errors if you do something wrong.



Parsing Trash from an external file 



The information about Trash objects will be read from an outside file. The file has all of the
necessary information about each piece of trash in a single entry in the form Trash:weight.
There are multiple entries on a line, separated by commas:

//:! C09:Trash.dat
Glass:54, Paper:22, Paper:11, Glass:!?,
, Paper:88, Aluminum:76, Cardboard:96,
Aluminum:25, Aluminum:34, Glass:11, Glass:68,
Glass:43, Aluminum:27, Cardboard:44, Aluminum:18,
Paper:91, Glass:63, Glass:50, Glass:80,
, Cardboard:12, Glass:12, Glass:54,
Aluminum:36, Aluminum:93, Glass:93, Paper:80,
Glass:36, Glass:12, Glass:60, Paper:66,
Aluminum:36, Cardboard:22,
///:~

To parse this, the line is read and the string member function find( ) produces the index of the
':'. This is first used with the string member function substr( ) to extract the name of the
trash type, and next to get the weight that is turned into a double with the atof( ) function
(from <cstdlib>).
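
The find( ), substr( ), and atof( ) steps just described can be sketched in isolation as follows. This is only an illustration: Entry, trimSpaces( ), and parseLine( ) are names invented here, trimSpaces( ) is a simplified substitute for the book's trim( ), and entries without a colon are silently skipped, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, double> Entry;

// Hypothetical helper: strip spaces and tabs from both ends.
std::string trimSpaces(const std::string& s) {
  std::string::size_type b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  std::string::size_type e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Split one line of comma-separated "Type:weight" entries.
// Entries that contain no colon are skipped.
std::vector<Entry> parseLine(const std::string& line) {
  std::vector<Entry> result;
  std::string::size_type start = 0;
  while (start < line.size()) {
    std::string::size_type comma = line.find(',', start);
    if (comma == std::string::npos) comma = line.size();
    std::string entry = line.substr(start, comma - start);
    std::string::size_type colon = entry.find(':');
    if (colon != std::string::npos) {
      std::string type = trimSpaces(entry.substr(0, colon));
      // atof() skips leading white space on its own.
      double weight = std::atof(entry.substr(colon + 1).c_str());
      result.push_back(Entry(type, weight));
    }
    start = comma + 1;
  }
  return result;
}

// Usage sketch with a line in the same shape as Trash.dat.
void parseExample() {
  std::vector<Entry> v = parseLine("Glass:54, Paper:22, Aluminum : 76,");
  assert(v.size() == 3);
  assert(v[0].first == "Glass" && v[0].second == 54);
  assert(v[2].first == "Aluminum" && v[2].second == 76);
}
```

Note that positions are held in string::size_type rather than int, so the comparison against string::npos is exact.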

The Trash file parser is placed in a separate file since it will be reused throughout this
chapter. To facilitate this reuse, the function fillBin( ) which does the work takes as its first
argument the name of the file to open and read, and as its second argument a reference to an
object of type Fillable. This uses what I've named the "interface" idiom at the beginning of
the chapter, and the only attribute for this particular interface is that "it can be filled," via a
member function addTrash( ). Here's the header file for Fillable:

//: C09:Fillable.h
// Any object that can be filled with Trash
#ifndef FILLABLE_H
#define FILLABLE_H

class Trash; // Forward declaration

class Fillable {
public:
  virtual void addTrash(Trash* t) = 0;
};
#endif // FILLABLE_H ///:~

Notice that it follows the interface idiom of having no non-static data members, and all pure
virtual member functions.

This way, any class which implements this interface (typically using multiple inheritance) can
be filled using fillBin( ). Here's the header file:

//: C09:fillBin.h
// Open a file and parse its contents into
// Trash objects, placing each into a vector
#ifndef FILLBIN_H
#define FILLBIN_H
#include "Fillablevector.h"
#include <vector>
#include <string>

void
fillBin(std::string filename, Fillable& bin);

// Special case to handle vector:
inline void fillBin(std::string filename,
  std::vector<Trash*>& bin) {
  Fillablevector fv(bin);
  fillBin(filename, fv);
}
#endif // FILLBIN_H ///:~

The overloaded version will be discussed shortly. First, here is the implementation:

//: C09:fillBin.cpp {O}
// Implementation of fillBin()
#include "fillBin.h"
#include "Fillable.h"
#include "../C01/trim.h"
#include "../require.h"
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;

void fillBin(string filename, Fillable& bin) {
  ifstream in(filename.c_str());
  assure(in, filename.c_str());
  string s;
  while(getline(in, s)) {
    string::size_type comma = s.find(',');
    while(comma != string::npos) {
      string e = trim(s.substr(0, comma));
      // Parse each entry:
      string::size_type colon = e.find(':');
      string type = e.substr(0, colon);
      double weight =
        atof(e.substr(colon + 1).c_str());
      bin.addTrash(
        Trash::factory(
          Trash::Info(type, weight)));
      // Move to next part of line:
      s = s.substr(comma + 1);
      comma = s.find(',');
    }
  }
} ///:~



After the file is opened, each line is read and parsed into entries by looking for the separating
comma, then each entry is parsed into its type and weight by looking for the separating colon.
Note the convenience of using the trim( ) function from chapter 17 to remove the white space
from both ends of a string. Once the type and weight are discovered, an Info object is created
from that data and passed to the factory( ). The result of this call is a Trash* which is passed
to the addTrash( ) function of the bin (which is the only function, remember, that a Fillable
guarantees).

Anything that supports the Fillable interface can be used with fillBin( ). Of course, vector
doesn't implement Fillable, so it won't work. Since vector is used in most of the examples, it
makes sense to add the second overloaded fillBin( ) function that takes a vector, as seen
previously in fillBin.h. But how to make a vector<Trash*> adapt to the Fillable interface,
which says it must have an addTrash( ) member function? The key is in the word "adapt";
we use the adapter pattern to create a class that has a vector and is also Fillable.

By saying "is also Fillable," the hint is strong (is-a) to inherit from Fillable. But what about 
the vector<Trash*>? Should this new class inherit from that? We don't actually want to be 
making a new kind of vector, which would force everyone to only use our vector in this 
situation. Instead, we want someone to be able to have their own vector and say "please fill 
this." So the new class should just keep a reference to that vector: 

//: C09:Fillablevector.h
// Adapter that makes a vector<Trash*> Fillable
#ifndef FILLABLEVECTOR_H
#define FILLABLEVECTOR_H
#include "Trash.h"
#include "Fillable.h"
#include <vector>

class Fillablevector : public Fillable {
  std::vector<Trash*>& v;
public:
  Fillablevector(std::vector<Trash*>& vv)
    : v(vv) {}
  void addTrash(Trash* t) { v.push_back(t); }
};
#endif // FILLABLEVECTOR_H ///:~

You can see that the only job of this class is to connect Fillable's addTrash( ) member
function to vector's push_back( ) (that's the "adapter" motivation). With this class in hand,
the overloaded fillBin( ) function can be used with a vector in fillBin.h:

inline void fillBin(std::string filename,
  std::vector<Trash*>& bin) {
  Fillablevector fv(bin);
  fillBin(filename, fv);
}

Notice that the adapter object fv only exists for the duration of the function call, and it wraps
bin in an interface that works with the other fillBin( ) function.
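
The same adapter arrangement can be tried out in a self-contained form. In this sketch Sink, VectorSink, and fill( ) are hypothetical stand-ins for Fillable, Fillablevector, and fillBin( ), and strings take the place of Trash pointers so that nothing else from the chapter is needed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Interface: anything that can accept a string (stand-in for Fillable).
class Sink {
public:
  virtual void add(const std::string& s) = 0;
  virtual ~Sink() {}
};

// Adapter: holds a reference to the caller's vector and forwards add().
class VectorSink : public Sink {
  std::vector<std::string>& v;
public:
  explicit VectorSink(std::vector<std::string>& vv) : v(vv) {}
  void add(const std::string& s) { v.push_back(s); }
};

// Generic producer that only knows the Sink interface.
void fill(Sink& sink) {
  sink.add("Glass");
  sink.add("Paper");
}

// Convenience overload: wrap the vector for the duration of the call.
inline void fill(std::vector<std::string>& v) {
  VectorSink adapter(v);
  fill(adapter);
}

// Usage sketch: the caller keeps an ordinary vector of its own.
void adapterExample() {
  std::vector<std::string> mine;
  fill(mine);
  assert(mine.size() == 2);
  assert(mine[0] == "Glass" && mine[1] == "Paper");
}
```

Because the adapter stores a reference rather than a copy, everything added through it lands in the caller's own vector.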

This approach works for any container class that's used frequently. Alternatively, the
container can multiply inherit from Fillable. (You'll see this later, in DynaTrash.cpp.)



Recycling with prototyping 



//: C09:Recycle3.cpp
//{L} TrashPrototypeInit
//{L} fillBin Trash TrashStatics
// Recycling with RTTI
#include "Trash.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "fillBin.h"
#include "sumValue.h"
#include "../purge.h"
#include <fstream>
#include <vector>
using namespace std;
ofstream out("Recycle3.out");

int main() {
  vector<Trash*> bin;
  // Fill up the Trash bin:
  fillBin("Trash.dat", bin);
  vector<Aluminum*> alBin;
  vector<Paper*> paperBin;
  vector<Glass*> glassBin;
  vector<Trash*>::iterator it = bin.begin();
  while(it != bin.end()) {
    // Sort the Trash:
    Aluminum* ap =
      dynamic_cast<Aluminum*>(*it);
    Paper* pp = dynamic_cast<Paper*>(*it);
    Glass* gp = dynamic_cast<Glass*>(*it);
    if(ap) alBin.push_back(ap);
    if(pp) paperBin.push_back(pp);
    if(gp) glassBin.push_back(gp);
    it++;
  }
  sumValue(alBin);
  sumValue(paperBin);
  sumValue(glassBin);
  sumValue(bin);
  purge(bin);
} ///:~

The process of opening the data file containing Trash descriptions and the parsing of that file
have been wrapped into fillBin( ), so now it's no longer a part of our design focus. You will
see that throughout the rest of the chapter, no matter what new classes are added, fillBin( )
will continue to work without change, which indicates a good design.

In terms of object creation, this design does indeed severely localize the changes you need to
make to add a new type to the system. However, there's a significant problem in the use of
RTTI that shows up clearly here. The program seems to run fine, and yet it never detects any
cardboard, even though there is cardboard in the list of trash data! This happens because of
the use of RTTI, which looks for only the types that you tell it to look for. The clue that RTTI
is being misused is that every type in the system is being tested, rather than a single type or
subset of types. But if you forget to test for your new type, the compiler has nothing to say
about it.

As you will see later, there are ways to use polymorphism instead when you're testing for
every type. But if you use RTTI a lot in this fashion, and you add a new type to your system,
you can easily forget to make the necessary changes in your program and produce a difficult-to-find
bug. So it's worth trying to eliminate RTTI in this case, not just for aesthetic reasons:
it produces more maintainable code.
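
The silent failure is easy to reproduce on a small scale. The classes below are bare stand-ins that reuse the chapter's names but none of its members, and Sorted and sortByRtti( ) are hypothetical; the point is only that a type the sorter doesn't test for falls through without any diagnostic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Trash { virtual ~Trash() {} };
struct Aluminum : Trash {};
struct Paper : Trash {};
struct Glass : Trash {};
struct Cardboard : Trash {};  // Added later; the sorter was never updated.

struct Sorted {
  std::vector<Aluminum*> al;
  std::vector<Paper*> paper;
  std::vector<Glass*> glass;
  std::size_t total() const {
    return al.size() + paper.size() + glass.size();
  }
};

// Tests every type it knows about; anything else falls through silently.
Sorted sortByRtti(const std::vector<Trash*>& bin) {
  Sorted s;
  for (std::size_t i = 0; i < bin.size(); ++i) {
    if (Aluminum* a = dynamic_cast<Aluminum*>(bin[i]))
      s.al.push_back(a);
    else if (Paper* p = dynamic_cast<Paper*>(bin[i]))
      s.paper.push_back(p);
    else if (Glass* g = dynamic_cast<Glass*>(bin[i]))
      s.glass.push_back(g);
    // No branch for Cardboard, and no complaint from the compiler.
  }
  return s;
}

// Usage sketch: four pieces go in, only three come out sorted.
void rttiExample() {
  Aluminum a; Paper p; Glass g; Cardboard c;
  std::vector<Trash*> bin;
  bin.push_back(&a);
  bin.push_back(&p);
  bin.push_back(&g);
  bin.push_back(&c);
  Sorted s = sortByRtti(bin);
  assert(bin.size() == 4);
  assert(s.total() == 3);  // The Cardboard was dropped.
}
```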
Abstracting usage



With creation out of the way, it's time to tackle the remainder of the design: where the classes
are used. Since it's the act of sorting into bins that's particularly ugly and exposed, why not
take that process and hide it inside a class? This is simple "complexity hiding," the principle
of "If you must do something ugly, at least localize the ugliness." In an OOP language, the
best place to hide complexity is inside a class. Here's a first cut:



[Diagram: a TrashSorter holds a vector of Trash bins, which refers to vector<Aluminum*>,
vector<Paper*>, vector<Glass*>, and vector<Cardboard*>.]

A TrashSorter object holds a vector that somehow connects to vectors holding specific
types of Trash. The most convenient solution would be a vector<vector<Trash*> >, but it's
too early to tell if that would work out best.

In addition, we'd like to have a sort( ) function as part of the TrashSorter class. But, keeping
in mind that the goal is easy addition of new types of Trash, how would the statically-coded
sort( ) function deal with the fact that a new type has been added? To solve this, the type
information must be removed from sort( ) so all it needs to do is call a generic function which
takes care of the details of type. This, of course, is another way to describe a virtual function.
So sort( ) will simply move through the vector of Trash bins and call a virtual function for
each. I'll call the function grab(Trash*), so the structure now looks like this:



[Diagram: each of vector<Aluminum*>, vector<Paper*>, vector<Glass*>, and
vector<Cardboard*> now carries a bool grab(Trash*) member function.]
However, TrashSorter needs to call grab( ) polymorphically, through a common base class
for all the vectors. This base class is very simple, since it only needs to establish the interface
for the grab( ) function.

Now there's a choice. Following the above diagram, you could put a vector of trash pointers
as a member object of each subclassed TBin. However, you will want to treat each TBin as a
vector, and perform all the vector operations on it. You could create a new interface and
forward all those operations, but that produces work and potential bugs. The type we're
creating is really a TBin and a vector, which suggests multiple inheritance. However, it turns
out that's not quite necessary, for the following reason.

Each time a new type is added to the system the programmer will have to go in and derive a
new class for the vector that holds the new type of Trash, along with its grab( ) function.
The code the programmer writes will actually be identical code except for the type it's
working with. That last phrase is the key to introduce a template, which will do all the work of
adding a new type. Now the diagram looks more complicated, although the process of adding
a new type to the system will be simple. Here, TrashBin can inherit from TBin, which
inherits from vector<Trash*> like this (the multiple-lined arrows indicate template
instantiation):



[Diagram: TBin derives from vector<Trash*> and declares virtual bool grab(Trash*). The
template TrashBin<TrashType> implements grab( ) and is instantiated as TrashBin<Aluminum>,
TrashBin<Paper>, TrashBin<Glass>, and TrashBin<Cardboard>. TrashSorter, with
bool sort(Trash*), holds the bins.]

The reason TrashBin must be a template is so it can automatically generate the grab( )
function. A further templatization will allow the vectors to hold specific types.

That said, we can look at the whole program to see how all this is implemented.

//: C09:Recycle4.cpp
//{L} TrashPrototypeInit
//{L} fillBin Trash TrashStatics
// Adding TrashBins and TrashSorters
#include "Trash.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "Cardboard.h"
#include "fillBin.h"
#include "sumValue.h"
#include "../purge.h"
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
ofstream out("Recycle4.out");

class TBin : public vector<Trash*> {
public:
  virtual bool grab(Trash*) = 0;
  // Bins are deleted through TBin pointers:
  virtual ~TBin() {}
};

template<class TrashType>
class TrashBin : public TBin {
public:
  bool grab(Trash* t) {
    TrashType* tp = dynamic_cast<TrashType*>(t);
    if(!tp) return false; // Not grabbed
    push_back(tp);
    return true; // Object grabbed
  }
};

class TrashSorter : public vector<TBin*> {
public:
  bool sort(Trash* t) {
    for(iterator it = begin(); it != end(); it++)
      if((*it)->grab(t))
        return true;
    return false;
  }
  void sortBin(vector<Trash*>& bin) {
    vector<Trash*>::iterator it;
    for(it = bin.begin(); it != bin.end(); it++)
      if(!sort(*it))
        cerr << "bin not found" << endl;
  }
  ~TrashSorter() { purge(*this); }
};

int main() {
  vector<Trash*> bin;
  // Fill up the Trash bin:
  fillBin("Trash.dat", bin);
  TrashSorter tbins;
  tbins.push_back(new TrashBin<Aluminum>);
  tbins.push_back(new TrashBin<Paper>);
  tbins.push_back(new TrashBin<Glass>);
  tbins.push_back(new TrashBin<Cardboard>);
  tbins.sortBin(bin);
  for(TrashSorter::iterator it = tbins.begin();
    it != tbins.end(); it++)
    sumValue(**it);
  sumValue(bin);
  purge(bin);
} ///:~



[Diagram: the TrashSorter holds a vector of Trash bins (Aluminum, Paper, and Glass
vectors), each with its own grab(Trash) function.]

TrashSorter needs to call each grab( ) member function and get a different result depending
on what type of Trash the current vector is holding. That is, each vector must be aware of
the type it holds. This "awareness" is accomplished with a virtual function, the grab( )
function, which thus eliminates at least the outward appearance of the use of RTTI. The
implementation of grab( ) does use RTTI, but it's templatized so as long as you put a new
TrashBin in the TrashSorter when you add a type, everything else is taken care of.

Memory is managed by denoting bin as the "master container," the one responsible for
cleanup. With this rule in place, calling purge( ) for bin cleans up all the Trash objects. In
addition, TrashSorter assumes that it "owns" the pointers it holds, and cleans up all the
TrashBin objects during destruction.
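
To see the templatized grab( ) working without the rest of the recycling program, here is a pared-down sketch. Bin, BinFor, and sortOne( ) are hypothetical names, the Trash types are empty stand-ins, and each bin keeps its pointers in a member vector instead of inheriting from vector as TBin does; that last choice is only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Trash { virtual ~Trash() {} };
struct Paper : Trash {};
struct Glass : Trash {};
struct Cardboard : Trash {};

// Common base so a sorter can call grab() without knowing the bin's type.
class Bin {
public:
  std::vector<Trash*> items;
  virtual bool grab(Trash* t) = 0;
  virtual ~Bin() {}
};

// One template generates the otherwise identical grab() for each type.
template<class T>
class BinFor : public Bin {
public:
  bool grab(Trash* t) {
    if (!dynamic_cast<T*>(t)) return false;  // Not this bin's type
    items.push_back(t);
    return true;
  }
};

// Offer the object to each bin in turn; stop at the first that takes it.
bool sortOne(std::vector<Bin*>& bins, Trash* t) {
  for (std::size_t i = 0; i < bins.size(); ++i)
    if (bins[i]->grab(t)) return true;
  return false;
}

// Usage sketch: supporting Cardboard would take one more BinFor<Cardboard>.
void binExample() {
  BinFor<Paper> paperBin;
  BinFor<Glass> glassBin;
  std::vector<Bin*> bins;
  bins.push_back(&paperBin);
  bins.push_back(&glassBin);
  Paper p; Glass g; Cardboard c;
  bool gotGlass = sortOne(bins, &g);
  bool gotPaper = sortOne(bins, &p);
  bool gotCardboard = sortOne(bins, &c);
  assert(gotGlass && gotPaper);
  assert(!gotCardboard);  // No bin for it yet, and the caller is told so.
  assert(paperBin.items.size() == 1 && glassBin.items.size() == 1);
}
```

Unlike the hand-written chain of casts in Recycle3.cpp, an unsorted object produces a false return value that the caller can report.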
A basic OOP design principle is "Use data members for variation in state, use polymorphism
for variation in behavior." Your first thought might be that the grab( ) member function
certainly behaves differently for a vector that holds Paper than for one that holds Glass. But
what it does is strictly dependent on the type, and nothing else.

1. TbinList holds a set of TBin pointers, so that sort( ) can iterate through the TBins when
it's looking for a match for the Trash object you've handed it.

2. sortBin( ) allows you to pass an entire TBin in, and it moves through the TBin, picks out
each piece of Trash, and sorts it into the appropriate specific TBin. Notice the genericity
of this code: it doesn't change at all if new types are added. If the bulk of your code
doesn't need changing when a new type is added (or some other change occurs) then you
have an easily-extensible system.

Now you can see how easy it is to add a new type. Few lines must be changed to support
the addition. If it's really important, you can squeeze out even more by further
manipulating the design.

One member function call causes the contents of bin to be sorted into the respective
specifically-typed bins.



Applying double dispatching 



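The above design is certainly satisfactory. Adding new types to the system consists of adding
or modifying distinct classes without causing code changes to be propagated throughout the
system. In addition, RTTI is not "misused" as it was in Recycle1.cpp. However, it's
possible to go one step further and eliminate RTTI altogether from the operation of sorting the
trash into bins.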



To accomplish this, you must first take the perspective that all type-dependent activities,
such as detecting the type of a piece of trash and putting it into the appropriate bin, should be
controlled through polymorphism and dynamic binding.

The previous examples first sorted by type, then acted on sequences of elements that were all
of a particular type. But whenever you find yourself picking out particular types, stop and
think. The whole idea of polymorphism (dynamically-bound member function calls) is to
handle type-specific information for you. So why are you hunting for types?

The multiple-dispatch pattern demonstrated at the beginning of this chapter uses virtual
functions to determine all type information, thus eliminating RTTI.



Implementing the double dispatch 



In the Trash hierarchy we will now make use of the "stub" virtual function addToBin( ) that
was added to the base class Trash but unused up until now. This takes an argument of a
container of TypedBin. A Trash object uses addToBin( ) with this container to step through
and try to add itself to the appropriate bin, and this is where you'll see the double dispatch.

[Diagram: TypedBin declares add(Aluminum), add(Paper), add(Glass), and add(Cardboard);
AluminumBin, PaperBin, GlassBin, and CardboardBin derive from it.]

[Diagram: Trash declares addToBin(TypedBin[]); Aluminum, Paper, Glass, and Cardboard
each implement addToBin( ).]



The new hierarchy is TypedBin, and it contains its own member function called add( ) that is
also used polymorphically. But here's an additional twist: add( ) is overloaded to take
arguments of the different types of Trash. So an essential part of the double dispatching
scheme also involves overloading (or at least having a group of virtual functions to call;
overloading happens to be particularly convenient here).

//: C09:TypedBin.h
#ifndef TYPEDBIN_H
#define TYPEDBIN_H
#include "Trash.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "Cardboard.h"
#include <vector>

// Template to generate double-dispatching
// trash types by inheriting from originals:
template<class TrashType>
class DD : public TrashType {
protected:
  DD() : TrashType(0) {}
  friend class TrashPrototypeInit;
public:
  DD(double wt) : TrashType(wt) {}
  bool addToBin(std::vector<TypedBin*>& tb) {
    for(int i = 0; i < tb.size(); i++)
      if(tb[i]->add(this))
        return true;
    return false;
  }
  // Override clone() to create this new type:
  Trash* clone(const Trash::Info& info) {
    return new DD(info.data());
  }
};

// vector<Trash*> that knows how to
// grab the right type
class TypedBin : public std::vector<Trash*> {
protected:
  bool addIt(Trash* t) {
    push_back(t);
    return true;
  }
public:
  // Bins are deleted through TypedBin pointers:
  virtual ~TypedBin() {}
  virtual bool add(DD<Aluminum>*) {
    return false;
  }
  virtual bool add(DD<Paper>*) {
    return false;
  }
  virtual bool add(DD<Glass>*) {
    return false;
  }
  virtual bool add(DD<Cardboard>*) {
    return false;
  }
};

// Template to generate specific TypedBins:
template<class TrashType>
class BinOf : public TypedBin {
public:
  // Only overrides add() for this specific type:
  bool add(TrashType* t) { return addIt(t); }
};
#endif // TYPEDBIN_H ///:~

In each particular subtype of Aluminum, Paper, Glass, and Cardboard, the addToBin( )
member function is implemented, but it looks like the code is exactly the same in each case.
The code in each addToBin( ) calls add( ) for each TypedBin object in the array. But notice
the argument: this. The type of this is different for each subclass of Trash, so the code is
different. So this is the first part of the double dispatch, because once you're inside this
member function you know you're Aluminum, or Paper, etc. During the call to add( ), this
information is passed via the type of this. The compiler resolves the call to the proper
overloaded version of add( ). But since tb[i] produces a pointer to the base type TypedBin,
this call will end up calling a different member function depending on the type of TypedBin
that's currently selected. That is the second dispatch.
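
The two dispatches can be followed more easily in a reduced example with only two kinds of trash and no templates. Everything here is a simplified stand-in written for this illustration: Bin, PaperBin, GlassBin, and addTo( ) are hypothetical names, and each bin only counts what it accepts instead of storing pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Paper;
class Glass;

// Second dispatch target: add() is overloaded per trash type and virtual,
// so the bin's dynamic type picks the override that runs.
class Bin {
public:
  int count;
  Bin() : count(0) {}
  virtual ~Bin() {}
  virtual bool add(Paper*) { return false; }
  virtual bool add(Glass*) { return false; }
};

// First dispatch: addTo() is virtual, so the trash's dynamic type picks
// the body; inside it, the static type of "this" selects the overload.
class Trash {
public:
  virtual ~Trash() {}
  virtual bool addTo(std::vector<Bin*>& bins) = 0;
};

class Paper : public Trash {
public:
  bool addTo(std::vector<Bin*>& bins) {
    for (std::size_t i = 0; i < bins.size(); ++i)
      if (bins[i]->add(this)) return true;  // this is a Paper*
    return false;
  }
};

class Glass : public Trash {
public:
  bool addTo(std::vector<Bin*>& bins) {
    for (std::size_t i = 0; i < bins.size(); ++i)
      if (bins[i]->add(this)) return true;  // this is a Glass*
    return false;
  }
};

class PaperBin : public Bin {
public:
  using Bin::add;  // Keep the other overloads visible
  bool add(Paper*) { ++count; return true; }
};

class GlassBin : public Bin {
public:
  using Bin::add;
  bool add(Glass*) { ++count; return true; }
};

// Usage sketch: no casts anywhere, yet each piece finds its own bin.
void dispatchExample() {
  PaperBin pb;
  GlassBin gb;
  std::vector<Bin*> bins;
  bins.push_back(&pb);
  bins.push_back(&gb);
  Paper p; Glass g1, g2;
  Trash* pile[] = { &g1, &p, &g2 };
  for (std::size_t i = 0; i < 3; ++i) {
    bool added = pile[i]->addTo(bins);
    assert(added);
  }
  assert(pb.count == 1 && gb.count == 2);
}
```

The two addTo( ) bodies are textually identical, which is the duplication that the DD template in TypedBin.h is there to generate automatically.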

You can see that the overloaded add( ) methods all return false. If the member function is not
overridden in a derived class, it will continue to return false, and the caller (addToBin( ), in
this case) will assume that the current Trash object has not been added successfully to a
container, and continue searching for the right container.

In each of the subclasses of TypedBin, only one overloaded member function is overridden,
according to the type of bin that's being created. For example, CardboardBin overrides
add(DD<Cardboard>*). The overridden member function adds the Trash pointer to its
container and returns true, while all the rest of the add( ) methods in CardboardBin
continue to return false, since they haven't been overridden. With C++ templates, you don't
have to explicitly write the subclasses or place the addToBin( ) member function in Trash.



To set up for prototyping the new types of trash, there must be a different initializer file:

//: C09:DDTrashPrototypeInit.cpp {O}
#include "TypedBin.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "Cardboard.h"

std::vector<Trash*> Trash::prototypes;

class TrashPrototypeInit {
  DD<Aluminum> a;
  DD<Paper> p;
  DD<Glass> g;
  DD<Cardboard> c;
  TrashPrototypeInit() {
    Trash::prototypes.push_back(&a);
    Trash::prototypes.push_back(&p);
    Trash::prototypes.push_back(&g);
    Trash::prototypes.push_back(&c);
  }
  static TrashPrototypeInit singleton;
};

TrashPrototypeInit
  TrashPrototypeInit::singleton; ///:~



Here's the rest of the program: 

//: C09:DoubleDispatch.cpp
//{L} DDTrashPrototypeInit
//{L} fillBin Trash TrashStatics
// Using multiple dispatching to handle more than
// one unknown type during a member function call
#include "TypedBin.h"
#include "fillBin.h"
#include "sumValue.h"
#include "../purge.h"
#include <iostream>
#include <fstream>
using namespace std;
ofstream out("DoubleDispatch.out");

class TrashBinSet : public vector<TypedBin*> {
public:
  TrashBinSet() {
    push_back(new BinOf<DD<Aluminum> >);
    push_back(new BinOf<DD<Paper> >);
    push_back(new BinOf<DD<Glass> >);
    push_back(new BinOf<DD<Cardboard> >);
  }
  void sortIntoBins(vector<Trash*>& bin) {
    vector<Trash*>::iterator it;
    for(it = bin.begin(); it != bin.end(); it++)
      // Perform the double dispatch:
      if(!(*it)->addToBin(*this))
        cerr << "Couldn't add " << *it << endl;
  }
  ~TrashBinSet() { purge(*this); }
};

int main() {
  vector<Trash*> bin;
  TrashBinSet bins;
  // fillBin() still works, without changes, but
  // different objects are cloned:
  fillBin("Trash.dat", bin);
  // Sort from the master bin into the
  // individually-typed bins:
  bins.sortIntoBins(bin);
  TrashBinSet::iterator it;
  for(it = bins.begin(); it != bins.end(); it++)
    sumValue(**it);
  // ... and for the master bin
  sumValue(bin);
  purge(bin);
} ///:~

TrashBinSet encapsulates all of the different types of TypedBins, along with the
sortIntoBins( ) member function, which is where all the double dispatching takes place. You
can see that once the structure is set up, sorting into the various TypedBins is remarkably
easy. In addition, the efficiency of two virtual calls and the double dispatch is probably better
than any other way you could sort.

Notice the ease of use of this system in main( ), as well as the complete independence of any
specific type information within main( ). All other methods that talk only to the Trash base-class
interface will be equally invulnerable to changes in Trash types.

The changes necessary to add a new type are relatively isolated: you inherit the new type of
Trash with its addToBin( ) member function, then make a small modification to TypedBin,
and finally you add a new type into the vector in TrashBinSet and modify
DDTrashPrototypeInit.cpp.

Applying the visitor pattern 

Now consider applying a design pattern with an entirely different goal to the trash-sorting
problem. As demonstrated earlier in this chapter, the visitor pattern's goal is to allow the
addition of new polymorphic operations to a frozen inheritance hierarchy.

For this pattern, we are no longer concerned with optimizing the addition of new types of
Trash to the system. Indeed, this pattern makes adding a new type of Trash more
complicated. It looks like this:



[Diagram: Trash declares accept(Visitor&), and Aluminum, Paper, Glass, and Cardboard each
implement it as v.visit(this). Visitor declares visit(Aluminum*), visit(Paper*),
visit(Glass*), and visit(Cardboard*); PriceVisitor, WeightVisitor, and other subclasses
implement each visit( ) with the type-specific work.]





Now, if t is a Trash pointer to an Aluminum object, the code:

PriceVisitor pv;
t->accept(pv);

causes two polymorphic member function calls: the first one to select Aluminum's version of
accept( ), and the second one within accept( ) when the specific version of visit( ) is called
dynamically using the base-class Visitor reference v.

This configuration means that new functionality can be added to the system in the form of
new subclasses of Visitor. The Trash hierarchy doesn't need to be touched. This is the prime
benefit of the visitor pattern: you can add new polymorphic functionality to a class hierarchy
without touching that hierarchy (once the accept( ) methods have been installed). Note that
the benefit is helpful here but not exactly what we started out to accomplish, so at first blush
you might decide that this isn't the desired solution.

But look at one thing that's been accomplished: the visitor solution avoids sorting from the
master Trash sequence into individual typed sequences. Thus, you can leave everything in the
single master sequence and simply pass through that sequence using the appropriate visitor to
accomplish the goal. Although this behavior seems to be a side effect of visitor, it does give
us what we want (avoiding RTTI).
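
Here is a compact sketch of that idea: one master sequence, two visitors, no sorting and no casts. The classes are cut-down stand-ins that reuse the chapter's names for readability, the per-unit prices are made-up illustration values, and only two kinds of trash are included to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Aluminum;
class Paper;

class Visitor {
public:
  virtual ~Visitor() {}
  virtual void visit(Aluminum* a) = 0;
  virtual void visit(Paper* p) = 0;
};

class Trash {
  double wt;
public:
  explicit Trash(double w) : wt(w) {}
  virtual ~Trash() {}
  double weight() const { return wt; }
  virtual void accept(Visitor& v) = 0;
};

class Aluminum : public Trash {
public:
  explicit Aluminum(double w) : Trash(w) {}
  void accept(Visitor& v) { v.visit(this); }  // this is an Aluminum*
};

class Paper : public Trash {
public:
  explicit Paper(double w) : Trash(w) {}
  void accept(Visitor& v) { v.visit(this); }  // this is a Paper*
};

// New operations are added as Visitor subclasses; Trash is untouched.
class PriceVisitor : public Visitor {
public:
  double total;
  PriceVisitor() : total(0) {}
  void visit(Aluminum* a) { total += a->weight() * 2.0; }  // Made-up price
  void visit(Paper* p) { total += p->weight() * 0.5; }     // Made-up price
};

class WeightVisitor : public Visitor {
public:
  double total;
  WeightVisitor() : total(0) {}
  void visit(Aluminum* a) { total += a->weight(); }
  void visit(Paper* p) { total += p->weight(); }
};

// Usage sketch: pass each visitor over the single master sequence.
void visitorExample() {
  Aluminum a(10);
  Paper p(4);
  std::vector<Trash*> bin;
  bin.push_back(&a);
  bin.push_back(&p);
  PriceVisitor pv;
  WeightVisitor wv;
  for (std::size_t i = 0; i < bin.size(); ++i) {
    bin[i]->accept(pv);
    bin[i]->accept(wv);
  }
  assert(pv.total == 22.0);  // 10 * 2.0 + 4 * 0.5
  assert(wv.total == 14.0);
}
```

Adding a third operation means writing one more Visitor subclass; adding a third kind of trash, on the other hand, means touching Visitor and every class derived from it.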

The double dispatching in the visitor pattern takes care of determining both the type of Trash
and the type of Visitor. In the following example, there are two implementations of Visitor:
PriceVisitor to both determine and sum the price, and WeightVisitor to keep track of the
weights.

You can see all of this implemented in the new, improved version of the recycling program.
As with DoubleDispatch.cpp, the Trash class has had an extra member function stub
(accept( )) inserted in it to allow for this example.


e there's nothing 



n the Visitor base c 



//: C09:Visitor.h
// The base interface for visitors
// and template for visitable Trash types
#ifndef VISITOR_H
#define VISITOR_H
#include "Trash.h"
#include "Aluminum.h"
#include "Paper.h"
#include "Glass.h"
#include "Cardboard.h"

class Visitor {
public:
  virtual void visit(Aluminum* a) = 0;
  virtual void visit(Paper* p) = 0;
  virtual void visit(Glass* g) = 0;
  virtual void visit(Cardboard* c) = 0;
};

// Template to generate visitable
// trash types by inheriting from originals:
template<class TrashType>
class Visitable : public TrashType {
protected:
  Visitable() : TrashType(0) {}
  friend class TrashPrototypeInit;
public:
  Visitable(double wt) : TrashType(wt) {}
  // Remember "this" is pointer to current type:
  void accept(Visitor& v) { v.visit(this); }
  // Override clone() to create this new type:
  Trash* clone(const Trash::Info& info) {
    return new Visitable(info.data());
  }
};
#endif // VISITOR_H ///:~
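As before, a different version of TrashPrototypeInit.cpp is necessary:

//: C09:VisitorTrashPrototypeInit.cpp {O}
#include "Visitor.h"
std::vector<Trash*> Trash::prototypes;

class TrashPrototypeInit {
  Visitable<Aluminum> a;
  Visitable<Paper> p;
  Visitable<Glass> g;
  Visitable<Cardboard> c;
  TrashPrototypeInit() {
    Trash::prototypes.push_back(&a);
    Trash::prototypes.push_back(&p);
    Trash::prototypes.push_back(&g);
    Trash::prototypes.push_back(&c);
  }
  static TrashPrototypeInit singleton;
};

TrashPrototypeInit
  TrashPrototypeInit::singleton; ///:~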



The rest of the program creates specific Visitor types and sends them through a single list of 
Trash objects: 

//: C09:TrashVisitor.cpp
//{L} VisitorTrashPrototypeInit
//{L} fillBin Trash TrashStatics
// The "visitor" pattern
#include "Visitor.h"
#include "fillBin.h"
#include "../purge.h"
#include <iostream>
#include <fstream>
using namespace std;
ofstream out("TrashVisitor.out");

class PriceVisitor : public Visitor {
  double alSum; // Aluminum
  double pSum; // Paper
  double gSum; // Glass
  double cSum; // Cardboard
public:
  // Start every running total at zero:
  PriceVisitor()
    : alSum(0), pSum(0), gSum(0), cSum(0) {}
  void visit(Aluminum* al) {
    double v = al->weight() * al->value();
    out << "value of Aluminum = " << v << endl;
    alSum += v;
  }
  void visit(Paper* p) {
    double v = p->weight() * p->value();
    out << "value of Paper = " << v << endl;
    pSum += v;
  }
  void visit(Glass* g) {
    double v = g->weight() * g->value();
    out << "value of Glass = " << v << endl;
    gSum += v;
  }
  void visit(Cardboard* c) {
    double v = c->weight() * c->value();
    out << "value of Cardboard = " << v << endl;
    cSum += v;
  }
  void total(ostream& os) {
    os << "Total Aluminum: $" << alSum << "\n" <<
      "Total Paper: $" << pSum << "\n" <<
      "Total Glass: $" << gSum << "\n" <<
      "Total Cardboard: $" << cSum << endl;
  }
};

class WeightVisitor : public Visitor {
  double alSum; // Aluminum
  double pSum; // Paper
  double gSum; // Glass
  double cSum; // Cardboard
public:
  // Start every running total at zero:
  WeightVisitor()
    : alSum(0), pSum(0), gSum(0), cSum(0) {}
  void visit(Aluminum* al) {
    alSum += al->weight();
    out << "weight of Aluminum = "
      << al->weight() << endl;
  }
  void visit(Paper* p) {
    pSum += p->weight();
    out << "weight of Paper = "
      << p->weight() << endl;
  }
  void visit(Glass* g) {
    gSum += g->weight();
    out << "weight of Glass = "
      << g->weight() << endl;
  }
  void visit(Cardboard* c) {
    cSum += c->weight();
    out << "weight of Cardboard = "
      << c->weight() << endl;
  }
  void total(ostream& os) {
    os << "Total weight Aluminum:"
      << alSum << endl;
    os << "Total weight Paper:"
      << pSum << endl;
    os << "Total weight Glass:"
      << gSum << endl;
    os << "Total weight Cardboard:"
      << cSum << endl;
  }
};

int main() {
  vector<Trash*> bin;
  // fillBin() still works, without changes, but
  // different objects are prototyped:
  fillBin("Trash.dat", bin);
  // You could even iterate through
  // a list of visitors!
  PriceVisitor pv;
  WeightVisitor wv;
  vector<Trash*>::iterator it = bin.begin();
  while(it != bin.end()) {
    (*it)->accept(pv);
    (*it)->accept(wv);
    it++;
  }
  pv.total(out);
  wv.total(out);
  purge(bin);
} ///:~

Note that the shape of main( ) has changed again. Now there's only a single Trash bin. The
two Visitor objects are accepted into every element in the sequence, and they perform their 
operations. The visitors keep their own internal data to tally the total weights and prices. 

Finally, there's no run-time type identification other than the inevitable cast to Trash when 
pulling things out of the sequence. 

One way you can distinguish this solution from the double dispatching solution described 
previously is to note that, in the double dispatching solution, only one of the overloaded 
methods, add(), was overridden when each subclass was created, while here each one of the 
overloaded visit( ) methods is overridden in every subclass of Visitor. 



More coupling? 



There's a lot more code here, and there's definite coupling between the Trash hierarchy and
the Visitor hierarchy. However, there's also high cohesion within the respective sets of 
classes: they each do only one thing (Trash describes trash, while Visitor describes actions 
performed on Trash), which is an indicator of a good design. Of course, in this case it works 
well only if you're adding new Visitors, but it gets in the way when you add new types of 
Trash. 

Low coupling between classes and high cohesion within a class is definitely an important 
design goal. Applied mindlessly, though, it can prevent you from achieving a more elegant 
design. It seems that some classes inevitably have a certain intimacy with each other. These 
often occur in pairs that could perhaps be called couplets, for example, containers and 
iterators. The Trash-Visitor pair above appears to be another such couplet. 

RTTI considered harmful? 

Various designs in this chapter attempt to remove RTTI, which might give you the impression
that it's "considered harmful" (the condemnation used for poor goto). This isn't true; it is the
misuse of RTTI that is the problem. The reason our designs removed RTTI is because the
misapplication of that feature prevented extensibility, which contravened the stated goal of
adding a new type to the system with as little impact on surrounding code as possible. Since
RTTI is often misused by having it look for every single type in your system, it causes code to
be non-extensible: when you add a new type, you have to go hunting for all the code in which
RTTI is used, and if you miss any you won't get help from the compiler.
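
The kind of misuse described above can be shown in a few lines. This is only an illustrative sketch: the pared-down Trash types, the function name totalValueRTTI( ), and the per-unit prices are all hypothetical and do not come from the recycling examples.

```cpp
#include <cassert>
#include <vector>

struct Trash {
  double weight;
  explicit Trash(double w) : weight(w) {}
  virtual ~Trash() {}
};
struct Aluminum : Trash {
  explicit Aluminum(double w) : Trash(w) {}
};
struct Paper : Trash {
  explicit Paper(double w) : Trash(w) {}
};
// Added to the system later; nobody revisited totalValueRTTI().
struct Glass : Trash {
  explicit Glass(double w) : Trash(w) {}
};

// Misuse of RTTI: the function tests for every type it knows about.
// The prices 2.0 and 0.5 are made-up values for the illustration.
inline double totalValueRTTI(const std::vector<Trash*>& bin) {
  double sum = 0;
  for (Trash* t : bin) {
    assert(t != nullptr);
    if (dynamic_cast<Aluminum*>(t))
      sum += t->weight * 2.0;
    else if (dynamic_cast<Paper*>(t))
      sum += t->weight * 0.5;
    // A Glass object falls through silently, and the compiler
    // has no way to warn that a case is missing.
  }
  return sum;
}
```

A bin that contains Glass still compiles and runs, but the Glass contributes nothing to the total. That silent omission is the extensibility problem, and it is caused by the type-by-type testing rather than by RTTI itself.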

However, RTTI doesn't automatically create non-extensible code. Let's revisit the trash
recycler once more. This time, a new tool will be introduced, which I call a TypeMap. It
inherits from a map that holds a variant of type_info object as the key, and vector<Trash*>
as the value. The interface is simple: you call addTrash( ) to add a new Trash pointer, and
the map class provides the rest of the interface. The keys represent the types contained in the
associated vector. The beauty of this design (suggested by Larry O'Brien) is that the
TypeMap dynamically adds a new key-value pair whenever it encounters a new type, so
whenever you add a new type to the system (even if you add the new type at runtime), it
adapts.

The example will again build on the structure of the Trash types, and will use fillBin( ) to
parse and insert the values into the TypeMap. However, TypeMap is not a vector<Trash*>,
and so it must be adapted to work with fillBin( ) by multiply inheriting from Fillable. In
addition, the Standard C++ type_info class is too restrictive to be used as a key, so a kind of
wrapper class TypeInfo is created, which simply extracts and stores the type_info char*
representation of the type (making the assumption that, within the realm of a single compiler,
this representation will be unique for each type).
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public: 

  TypeInfo(T* t) : id(typeid(*t).name()) {}
  const string& name() { return id; }
  friend bool operator<(const TypeInfo& lv,
    const TypeInfo& rv) {
    return lv.id < rv.id;
  }
};

class TypeMap :
  public map<TypeInfo<Trash>, vector<Trash*> >,
  public Fillable {
public:
  // Satisfies the Fillable interface:
  void addTrash(Trash* t) {
    (*this)[TypeInfo<Trash>(t)].push_back(t);
  }
  ~TypeMap() {
    for(iterator it = begin(); it != end(); it++)
      purge((*it).second);
  }
};

int main() {
  TypeMap bin;
  fillBin("Trash.dat", bin); // Sorting happens
  TypeMap::iterator it;
  for(it = bin.begin(); it != bin.end(); it++)
    sumValue((*it).second);
} ///:~

TypeInfo is templatized because typeid( ) does not allow the use of void*, which would be
the most general way to solve the problem. So you are required to work with some specific
class, but this class should be the most base of all the classes in your hierarchy. TypeInfo
must define an operator< because a map needs it to order its keys.

Although powerful, the definition for TypeMap is simple; the addTrash( ) member function
does most of the work. When you add a new Trash pointer, a TypeInfo<Trash> object
for that type is generated. This is used as a key to determine whether a vector holding objects
of that type is already present in the map. If so, the Trash pointer is added to that vector. If
not, the TypeInfo object and a new vector are added as a key-value pair.

An iterator to the map, when dereferenced, produces a pair object where the key (TypeInfo)
is the first member, and the value (vector<Trash*>) is the second member. And that's all
there is to it.

The TypeMap takes advantage of the design of fillBin( ), which doesn't just try to fill a
vector but instead anything that implements the Fillable interface with its addTrash( )
member function. Since TypeMap is multiply inherited from Fillable, it can be used as an
argument to fillBin( ) like this:

  fillBin("Trash.dat", bin);

An interesting thing about this design is that even though it wasn't created to handle the
sorting, fillBin( ) is performing a sort every time it inserts a Trash pointer into bin. When the
Trash is thrown into bin it's immediately sorted by TypeMap's internal sorting mechanism.
Stepping through the TypeMap and operating on each individual vector becomes a simple
matter, and uses ordinary STL syntax.
As you can see, adding a new type to the system won't affect this code at all, nor the code in
TypeMap. This is certainly the smallest solution to the problem, and arguably the most 
elegant as well. It does rely heavily on RTTI, but notice that each key-value pair in the map is 
looking for only one type. In addition, there's no way you can "forget" to add the proper code 
to this system when you add a new type, since there isn't any code you need to add, other than 
that which supports the prototyping process (and you'll find out right away if you forget that). 
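
For readers with a C++11 compiler, the same idea can be sketched more compactly. Note the swap: instead of the TypeInfo wrapper that stores the name( ) string, this version keys the map on std::type_index, a standard wrapper that makes type_info usable as a map key. The simplified Trash types and the members typeCount( ) and countOf( ) are assumptions added only so the behavior can be observed; the sketch also uses containment rather than inheriting from map, and does not own or purge the pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct Trash { virtual ~Trash() {} };
struct Aluminum : Trash {};
struct Paper : Trash {};

// One vector per dynamic type; a new key-value pair appears
// the first time an object of a new type is added.
class TypeMap {
  std::map<std::type_index, std::vector<Trash*>> bins;
public:
  void addTrash(Trash* t) {
    assert(t != nullptr);
    // typeid(*t) reports the dynamic type because Trash
    // is polymorphic (it has a virtual function).
    bins[std::type_index(typeid(*t))].push_back(t);
  }
  // Number of distinct types seen so far.
  std::size_t typeCount() const { return bins.size(); }
  // Number of stored objects whose exact type is T.
  template<class T> std::size_t countOf() const {
    auto it = bins.find(std::type_index(typeid(T)));
    return it == bins.end() ? 0 : it->second.size();
  }
};
```

Adding two Aluminum objects and one Paper object produces two keys, with two pointers under Aluminum and one under Paper. A Glass class introduced later would get its own key with no change to TypeMap.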



Summary 



Coming up with a design such as TrashVisitor.cpp that contains a larger amount of code
than the earlier designs can seem at first to be counterproductive. It pays to notice what you're
trying to accomplish with various designs. Design patterns in general strive to separate the
things that change from the things that stay the same. The "things that change" can refer to
many different kinds of changes. Perhaps the change occurs because the program is placed
into a new environment or because something in the current environment changes (this could
be: "The user wants to add a new shape to the diagram currently on the screen"). Or, as in this
case, the change could be the evolution of the code body. While previous versions of the
trash-sorting example emphasized the addition of new types of Trash to the system,
TrashVisitor.cpp allows you to easily add new functionality without disturbing the Trash
hierarchy. There's more code in TrashVisitor.cpp, but adding new functionality to Visitor is
cheap. If this is something that happens a lot, then it's worth the extra effort and code to make
it happen more easily.

The discovery of the vector of change is no trivial matter; it's not something that an analyst 
can usually detect before the program sees its initial design. The necessary information will 
probably not appear until later phases in the project: sometimes only at the design or 
implementation phases do you discover a deeper or more subtle need in your system. In the 
case of adding new types (which was the focus of most of the "recycle" examples) you might
realize that you need a particular inheritance hierarchy only when you are in the maintenance 
phase and you begin extending the system! 

One of the most important things that you'll learn by studying design patterns seems to be an
about-face from what has been promoted so far in this book. That is: "OOP is all about 
polymorphism." This statement can produce the "two-year-old with a hammer" syndrome 
(everything looks like a nail). Put another way, it's hard enough to "get" polymorphism, and 
once you do, you try to cast all your designs into that one particular mold. 

What design patterns say is that OOP isn't just about polymorphism. It's about "separating the 
things that change from the things that stay the same." Polymorphism is an especially
important way to do this, and it turns out to be helpful if the programming language directly 
supports polymorphism (so you don't have to wire it in yourself, which would tend to make it 
prohibitively expensive). But design patterns in general show other ways to accomplish the 
basic goal, and once your eyes have been opened to this you will begin to search for more 
creative designs. 

Since the Design Patterns book came out and made such an impact, people have been
searching for other patterns. You can expect to see more of these appear as time goes on. Here
are some sites recommended by Jim Coplien, of C++ fame (http://www.bell-labs.com/~cope),
who is one of the main proponents of the patterns movement:

http://st-www.cs.uiuc.edu/users/patterns

http://c2.com/cgi/wiki

http://c2.com/ppr

http://www.bell-labs.com/people/cope/Patterns/Process/index.html

http://www.bell-labs.com/cgi-user/OrgPatterns/OrgPatterns

http://st-www.cs.uiuc.edu/cgi-bin/wikic/wikic

http://www.cs.wustl.edu/~schmidt/patterns.html

http://www.espinc.com/patterns/overview.html

Also note there has been a yearly conference on design patterns, called PLOP, that 
produces a published proceedings. The third one of these proceedings came out in 
late 1997 (all published by Addison-Wesley). 



Exercises 



1. Using SingletonPattern.cpp as a starting point, create a class that manages
a fixed number of its own objects. Assume the objects are database
connections and you only have a license to use a fixed quantity of these at
any one time.

2. Create a minimal Observer-Observable design in two classes, without base 
classes and without the extra arguments in Observer.h and the member 
functions in Observable.h. Just create the bare minimum in the two classes, 
then demonstrate your design by creating one Observable and many 
Observers, and cause the Observable to update the Observers.

3. Change InnerClassIdiom.cpp so that Outer uses multiple inheritance 
instead of the inner class idiom. 

4. Add a class Plastic to TrashVisitor.cpp.

5. Add a class Plastic to DynaTrash.cpp. 

6. Explain how AbstractFactory.cpp demonstrates Double Dispatching and 
the Factory Method. 

7. Modify ShapeFactory2.cpp so that it uses an Abstract Factory to create
different sets of shapes (for example, one particular type of factory object
creates "thick shapes," another creates "thin shapes," but each factory object
can create all the shapes: circles, squares, triangles etc.).

8. Create a business-modeling environment with three types of Inhabitant:
Dwarf (for engineers), Elf (for marketers) and Troll (for managers). Now
create a class called Project that creates the different inhabitants and causes
them to interact( ) with each other using multiple dispatching.

9. Modify the above example to make the interactions more detailed. Each
Inhabitant can randomly produce a Weapon using getWeapon( ): a
Dwarf uses Jargon or Play, an Elf uses InventFeature or
SellImaginaryProduct, and a Troll uses Edict and Schedule. You must
decide which weapons "win" and "lose" in each interaction (as in
PaperScissorsRock.cpp). Add a battle( ) member function to Project that
takes two Inhabitants and matches them against each other. Now create a
meeting( ) member function for Project that creates groups of Dwarf, Elf
and Manager and battles the groups against each other until only members
of one group are left standing. These are the "winners."

10. Implement Chain of Responsibility to create an "expert system" that solves
problems by successively trying one solution after another until one
matches. You should be able to dynamically add solutions to the expert
system. The test for solution should just be a string match, but when a
solution fits, the expert system should return the appropriate type of
problemSolver object. What other pattern/patterns show up here?
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11: Tools & topics 

Tools created & used during the development of this book 
and various other handy things 

The code extractor 

The code for this book is automatically extracted directly from the ASCII text version of this
book. The book is normally maintained in a word processor capable of producing camera- 
ready copy, automatically creating the table of contents and index, etc. To generate the code 
files, the book is saved into a plain ASCII text file, and the program in this section 
automatically extracts all the code files, places them in appropriate subdirectories, and 
generates all the makefiles. The entire contents of the book can then be built, for each 
compiler, by invoking a single make command. This way, the code listings in the book can be 
regularly tested and verified, and in addition various compilers can be tested for some degree 
of compliance with Standard C++ (the degree to which all the examples in the book can
exercise a particular compiler, which is not too bad).



The code in this book is designed to be as generic as possible, but it is only tested under two 
operating systems: 32-bit Windows and Linux (using the Gnu C++ compiler g++, which 
means it should compile under other versions of Unix without too much trouble). You can 
easily get the latest sources for the book onto your machine by going to the web site 
www.BruceEckel.com and downloading the zipped archive containing all the code files and
makefiles. If you unzip this you'll have the book's directory tree available. However, it may 
not be configured for your particular compiler or operating system. In this case, you can 
generate your own using the ASCII text file for the book (available at www.BruceEckel.com) 
and the ExtractCode.cpp program in this section. Using a text editor, you find the 
CompileDB.txt file inside the ASCII text file for the book, edit it (leaving it in the book's text
file) to adapt it to your compiler and operating system, and then hand it to the ExtractCode 
program to generate your own source tree and makefiles. 

You've seen that each file to be extracted contains a starting marker (which includes the file 
name and path) and an ending marker. Files can be of any type, and if the colon after the 
comment is directly followed by a '!' then the starting and ending marker lines are not 
reproduced in the generated file. In addition, you've seen the other markers {O}, {L}, and {T} 
that have been placed inside comments; these are used to generate the makefile for each 
subdirectory. 
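
As a rough illustration of the marker format just described, the following sketch splits a starting marker line into its parts. It is not taken from ExtractCode.cpp; the Marker struct and parseStartMarker( ) are hypothetical names, and the sketch assumes a single space after the tag, a colon separating path from file name, and an optional trailing tag such as {O} separated by a space.

```cpp
#include <cassert>
#include <string>

// Result of examining one line of the book's text.
struct Marker {
  bool valid;        // Line is a usable starting marker
  bool writeTags;    // False when the tag is directly followed by '!'
  std::string path;  // Text before the last colon, e.g. "C09"
  std::string file;  // Text after it, e.g. "Visitor.h"
};

inline Marker parseStartMarker(const std::string& line) {
  Marker m = {false, true, "", ""};
  // A starting marker is a comment followed directly by a colon:
  if (line.compare(0, 3, "//:") != 0) return m;
  std::string::size_type sp = line.find(' ');
  if (sp == std::string::npos) return m;  // No name after the tag
  m.writeTags = line[3] != '!';
  std::string rest = line.substr(sp + 1);
  // Drop anything after the name, such as a {O} tag:
  std::string::size_type tag = rest.find(' ');
  if (tag != std::string::npos) rest = rest.substr(0, tag);
  // The last colon separates the path from the file name:
  std::string::size_type colon = rest.rfind(':');
  if (colon == std::string::npos) {
    m.file = rest;
  } else {
    m.path = rest.substr(0, colon);
    m.file = rest.substr(colon + 1);
  }
  assert(m.file.find(':') == std::string::npos);
  m.valid = !m.file.empty();
  return m;
}
```

Given the marker line for Visitor.h in chapter C09, the function reports path C09 and file Visitor.h with writeTags set; given the CompileDB.txt marker with its '!', it reports an empty path and clears writeTags so that the marker lines would not be copied into the generated file.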



If there's a mistake in the input file, then the program must report the error, which is the job of the
error( ) function at the beginning of the program. In addition, directory manipulation is not
supported by the standard libraries, so this is hidden away in the class OSDirControl. If you 
discover that this class will not compile on your system, you must replace the non-portable
function calls in OSDirControl with equivalent calls from your library. 
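
The statement about the standard libraries reflects the language at the time this was written. If you have a C++17 compiler, the std::filesystem library now covers these operations portably, so one possible replacement looks like the sketch below. The class name ScopedDir is hypothetical; it assumes the same colon-separated path convention used by the markers and is meant as an illustration rather than a drop-in substitute for the classes in the listing.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates and enters a colon-separated directory path, then
// returns to the original directory when the object is destroyed.
class ScopedDir {
  fs::path old;
public:
  explicit ScopedDir(const std::string& colonPath)
    : old(fs::current_path()) {
    fs::path target = old;
    std::string::size_type start = 0;
    for (;;) {
      std::string::size_type colon = colonPath.find(':', start);
      std::string part = colonPath.substr(start,
        colon == std::string::npos ? colon : colon - start);
      if (!part.empty()) target /= part;
      if (colon == std::string::npos) break;
      start = colon + 1;
    }
    fs::create_directories(target);  // Makes all missing levels
    fs::current_path(target);        // Portable "change directory"
  }
  ~ScopedDir() {
    std::error_code ec;  // Destructors should not throw
    fs::current_path(old, ec);
  }
  ScopedDir(const ScopedDir&) = delete;
  ScopedDir& operator=(const ScopedDir&) = delete;
};
```

Constructing a ScopedDir inside a block makes the directories below the current one and moves into the deepest; leaving the block restores the previous working directory, which is the behavior the listing obtains from its own directory-handling classes.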

Although this program is very useful for distributing the code in the book, you'll see that it's 
also a useful example in its own right, since it partitions everything into sensible objects and 
also makes heavy use of the STL and the standard string class. You may note that one or two 
pieces of code might be duplicated from other parts of the book, and you might observe that 
some of the tools created within the program might have been broken out into their own 
reusable header files and cpp files. However, for easy unpacking of the book's source code it 
made more sense to keep everything lumped together in a single file.



//: C10:ExtractCode.cpp
// Automatically extracts code files from
// ASCII text of this book.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <map>
#include <set>
#include <algorithm>
#include <iterator> // ostream_iterator
#include <cstring> // strlen()
#include <cstdlib> // exit()
using namespace std;

string copyright =
  "// Available at http://www.BruceEckel.com\n"
  "// (c) Bruce Eckel 1999\n"
  "// Copyright notice in Copyright.txt\n";

string usage =
  " Usage:ExtractCode source\n"
  "where source is the ASCII file containing \n"
  "the embedded tagged sourcefiles. The ASCII \n"
  "file must also contain an embedded compiler\n"
  "configuration file called CompileDB.txt \n"
  "See Thinking in C++, 2nd ed. for details\n";



// Tool to remove the white space from both ends:
string trim(const string& s) {
  if(s.length() == 0)
    return s;
  int b = s.find_first_not_of(" \t");
  int e = s.find_last_not_of(" \t");
  if(b == -1) // No non-spaces
    return "";
  return string(s, b, e - b + 1);
}



// Manage all the error messaging:
void error(string problem, string message) {
  static const string border(
  "-----------------------------------------\n");
  class ErrReport {
    int count;
    string fname;
  public:
    ofstream errs;
    ErrReport(const char* fn = "ExtractCodeErrors.txt")
      : count(0), fname(fn), errs(fname.c_str()) {}
    void operator++(int) { count++; }
    ~ErrReport() {
      errs.close();
      ifstream in(fname.c_str());
      cerr << in.rdbuf() << endl;
      cerr << count << " Errors found" << endl;
      cerr << "Messages in " << fname << endl;
    }
  };
  // Created on first call to this function;
  static ErrReport report;
  report++;
  report.errs << border << message << endl
    << "Problem spot: " << problem << endl;
}

///// OS-specific code, hidden inside a class:
#ifdef __GNUC__ // For gcc under Linux/Unix
#include <unistd.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <limits.h> // PATH_MAX
class OSDirControl {
public:
  static string getCurrentDir() {
    char path[PATH_MAX];
    getcwd(path, PATH_MAX);
    return string(path);
  }
  static void makeDir(string dir) {
    mkdir(dir.c_str(), 0777);
  }
  static void changeDir(string dir) {
    chdir(dir.c_str());
  }
};
#else // For Dos/Windows:
#include <direct.h>
class OSDirControl {
public:
  static string getCurrentDir() {
    char path[_MAX_PATH];
    getcwd(path, _MAX_PATH);
    return string(path);
  }
  static void makeDir(string dir) {
    mkdir(dir.c_str());
  }
  static void changeDir(string dir) {
    chdir(dir.c_str());
  }
};
#endif ///// End of OS-specific code

class PushDirectory {
  string oldpath;
public:
  PushDirectory(string path);
  ~PushDirectory() {
    OSDirControl::changeDir(oldpath);
  }
  void pushOneDir(string dir) {
    OSDirControl::makeDir(dir);
    OSDirControl::changeDir(dir);
  }
};

PushDirectory::PushDirectory(string path) {
  oldpath = OSDirControl::getCurrentDir();
  while(path.length() != 0) {
    int colon = path.find(':');
    if(colon != string::npos) {
      pushOneDir(path.substr(0, colon));
      path = path.substr(colon + 1);
    } else {
      pushOneDir(path);
      return;
    }
  }
}

//--------------- Manage code files -------------
// A CodeFile object knows everything about a
// particular code file, including contents, path
// information, how to compile, link, and test
// it, and which compilers it won't compile with.
enum TType {header, object, executable, none};

class CodeFile {
  TType _targetType;
  string _rawName, // Original name from input
    _path, // Where the source file lives
    _file, // Name of the source file
    _base, // Name without extension
    _tname, // Target name
    _testArgs; // Command-line arguments
  vector<string>
    lines, // Contains the file
    _compile, // Compile dependencies
    _link; // How to link the executable
  set<string>
    _noBuild; // Compilers it won't compile with
  bool writeTags; // Whether to write the markers
  // Initial makefile processing for the file:
  void target(const string& s);
  // For quoted #include headers:
  void headerLine(const string& s);
  // For special dependency tag marks:
  void dependLine(const string& s);
public:
  CodeFile(istream& in, string& s);
  const string& rawName() { return _rawName; }
  const string& path() { return _path; }
  const string& file() { return _file; }
  const string& base() { return _base; }
  const string& targetName() { return _tname; }
  TType targetType() { return _targetType; }
  const vector<string>& compile() {
    return _compile;
  }
  const vector<string>& link() {
    return _link;
  }
  const set<string>& noBuild() {
    return _noBuild;
  }
  const string& testArgs() { return _testArgs; }
  // Add a compiler it won't compile with:
  void addFailure(const string& failure) {
    _noBuild.insert(failure);
  }
  bool compilesOK(string compiler) {
    return _noBuild.count(compiler) == 0;
  }
  friend ostream&
  operator<<(ostream& os, const CodeFile& cf) {
    copy(cf.lines.begin(), cf.lines.end(),
      ostream_iterator<string>(os, ""));
    return os;
  }
  void write() {
    PushDirectory pd(_path);
    ofstream listing(_file.c_str());
    listing << *this; // Write the file
  }
  void dumpInfo(ostream& os);
};

void CodeFile::target(const string& s) {
  // Find the base name of the file (without
  // the extension):
  int lastDot = _file.find_last_of('.');
  if(lastDot == string::npos) {
    exit(1);
  }
  _base = _file.substr(0, lastDot);
  // Determine the type of file and target:
  if(s.find(".h") != string::npos ||
     s.find(".H") != string::npos) {
    _targetType = header;
    _tname = _file;
    return;
  }
  if(s.find(".txt") != string::npos
    || s.find(".TXT") != string::npos
    || s.find(".dat") != string::npos
    || s.find(".DAT") != string::npos) {
    // Text file, not involved in make
    _targetType = none;
    _tname = _file;
    return;
  }
  // C++ objs/exes depend on their own source:
  _compile.push_back(_file);
  if(s.find("{O}") != string::npos) {
    // Don't build an executable from this file
    _targetType = object;
    _tname = _base;
  } else {
    _targetType = executable;
    _tname = _base;
    // The exe depends on its own object file:
    _link.push_back(_base);
  }
}

void CodeFile::headerLine(const string& s) {
  int start = s.find('\"');
  int end = s.find('\"', start + 1);
  int len = end - start - 1;
  _compile.push_back(s.substr(start + 1, len));
}

void CodeFile::dependLine(const string& s) {
  const string linktag("//{L} ");
  string deps = trim(s.substr(linktag.length()));
  while(true) {
    int end = deps.find(' ');
    string dep = deps.substr(0, end);
    _link.push_back(dep);
    if(end == string::npos) // Last one
      break;
    else
      deps = trim(deps.substr(end));
  }
}

CodeFile::CodeFile(istream& in, string& s) {
  // If false, don't write begin & end tags:
  writeTags = (s[3] != '!');
  // Assume a space after the starting tag:
  _file = s.substr(s.find(' ') + 1);
  // There will always be at least one colon:
  int lastColon = _file.find_last_of(':');
  if(lastColon == string::npos) {
    lastColon = 0; // Recover from error
  }
  _rawName = trim(_file);
  _path = _file.substr(0, lastColon);
  _file = _file.substr(lastColon + 1);
  _file =_file.substr(0,_file.find_last_of(' '));
  cout << "path = [" << _path << "] "
    << "file = [" << _file << "]" << endl;
  target(s); // Determine target type
  if(writeTags){
    lines.push_back(s + '\n');
    lines.push_back(copyright);
  }
  string s2;
  while(getline(in, s2)) {
    // Look for specified link dependencies:
    if(s2.find("//{L}") == 0) // 0: Start of line
      dependLine(s2);
    // Look for command-line arguments for test:
    if(s2.find("//{T}") == 0) // 0: Start of line
      _testArgs = s2.substr(strlen("//{T}") + 1);
    // Look for quoted includes:
    if(s2.find("#include \"") != string::npos) {
      headerLine(s2); // Grab makefile info
    }
    // Look for end marker:
    if(s2.find("//" "/:~") != string::npos) {
      if(writeTags)
        lines.push_back(s2 + '\n');
      return; // Found the end
    }
    // Make sure you don't see another start:
    if(s2.find("//" ":") != string::npos) {
      error(s, "Error: new file started before"
        " previous file concluded");
      return;
    }
    // Write ordinary line:
    lines.push_back(s2 + '\n');
  }
}


void CodeFile::dumpInfo(ostream& os) {
  os << _path << ':' << _file << endl;
  os << "target: " << _tname << endl;
  os << "compile: " << endl;
  for(int i = 0; i < _compile.size(); i++)
    os << '\t' << _compile[i] << endl;
  os << "link: " << endl;
  for(int i = 0; i < _link.size(); i++)
    os << '\t' << _link[i] << endl;
  if(_noBuild.size() != 0) {
    os << "Won't build with: " << endl;
    copy(_noBuild.begin(), _noBuild.end(),
      ostream_iterator<string>(os, "\n"));
  }
}



//--------- Manage compiler information ---------
class CompilerData {
  // Information about each compiler:
  vector<string> rules; // Makefile rules
  set<string> fails; // Non-compiling files
  string objExtension; // File name extensions
  string exeExtension;
  // For OS-specific activities:
  bool _dos, _unix;
  // Store the information for all the compilers:
  static map<string, CompilerData> compilerInfo;
  static set<string> _compilerNames;
public:
  CompilerData() : _dos(false), _unix(false) {}
  // Read database of various compilers'
  // information and failure listings for
  // compiling the book files:
  static void readDB(istream& in);
  // For enumerating all the compiler names:
  static set<string>& compilerNames() {
    return _compilerNames;
  }
  // Tell this CodeFile which compilers
  // don't work with it:
  static void addFailures(CodeFile& cf);
  // Produce the proper object file name
  // extension for this compiler:
  static string obj(string compiler);
  // Produce the proper executable file name
  // extension for this compiler:
  static string exe(string compiler);
  // For inserting a particular compiler's
  // rules into a makefile:
  static void
  writeRules(string compiler, ostream& os);
  // Change forward slashes to backward
  // slashes if necessary:
  static string
  adjustPath(string compiler, string path);
  // So you can ask if it's a Unix compiler:
  static bool isUnix(string compiler) {
    return compilerInfo[compiler]._unix;
  }
  // So you can ask if it's a dos compiler:
  static bool isDos(string compiler) {
    return compilerInfo[compiler]._dos;
  }
  // Display information (for debugging):
  static void dump(ostream& os = cout);
};

// Static initialization:
map<string, CompilerData>
  CompilerData::compilerInfo;
set<string> CompilerData::_compilerNames;

idDB (i streams in) { 

compiler; // Name of current compi 

while (getline (in, s) ) { 

if (s.findl"*//" '■/:-") == 0) 
return; // Found end tag 
Is); 



lid CompilerOata: : re 



// Blank 



s = trimls); 

if (s.lengthl) == 0) cont 
ifls[0] == •#•) continue 
if ls[0] == •{ ') ! // Dif 

compiler = s.substr(0, s . find ('}')) ; 

compiler = trim (compiler . substr ( 1 ) ) ; 

if (compiler .length () != 0) 

_compilerNames . insert ( compiler ) ; 

continue; // Changed compiler name 
1 
ifls[0] == •(') { // Object file exten 

obj = trim (obj .substr (0, obj.findC) 
compiler Info [ compiler] . ob j Ext ens ion 



^compiler] . ob jExtens ion =obj; 



ifls[Oj = = 



•[•) ! // Executable extensi 

exe = trim(exe. substr (0, exe.findC]' 

compilerinf o [ compiler ] . exeExtens ion = 

1 

ifls[0] == 'S') I // Special directive 
if Is.findC'dos") != string: :npos) 

compilerinf o [ compiler ] ._dos = true; 

else if Is.findC'unix") != string::npo 

compilerinf o [ compiler ] ._unix = true 

error ("Compiler Information Oatabas 
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= ' e ) i / / Maxelile rule 
      string rule(s.substr(1)); // Remove the @
      if(rule[0] == ' ') // Space means tab
        rule = '\t' + trim(rule);
      compilerInfo[compiler].rules
        .push_back(rule);
    }
    // Otherwise, it's a failure line:
    compilerInfo[compiler].fails.insert(s);
  }
  error("CompileDB.txt", "Missing end tag");

void CompilerData::addFailures(CodeFile& cf) {
  set<string>::iterator it =
    _compilerNames.begin();
  while(it != _compilerNames.end()) {
    if(compilerInfo[*it]
      .fails.count(cf.rawName()) != 0)
      cf.addFailure(*it);
    it++;
  }
}



tring CompilerData :: ob j (string compiler) | 
if (compiler Info. count (compiler) ! = 0) { 
string ext ( 

compilerlnfo [ compiler ] . ob jExtens ion ) ; 
if (ext. length != 0) 

ext = '.' + ext; // Use '.' if it exis 



ring CompilerData :: exe (string compiler) | 
if (compiler Info. count (compiler) 1=0) ! 
string ext ( 

compilerlnfo [ compiler] . exeExtension ) ; 

if (ext. length () != 0) 
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return "No such compiler info 



oid CompilerData: : writeRules 1 
string compiler, ostreamS os ) { 
if (_compilerNames .count (compiler) ==0) { 
OS << "No info on this compiler" << endl 

} 

vector<string>S r = 

compilerlnfo [ compiler ] .rules; 
copy (r. begin 1) , r . end () , 



string CompilerData::adjustPath(
  string compiler, string path) {
  // Use STL replace algorithm:
  if(compilerInfo[compiler]._dos)
    replace(path.begin(), path.end(), '/', '\\');
  return path;
}



oid CompilerData : : dump (ost reams os ) | 

*out++ = "Compiler Names:"; 
copy (_compilerNames . begin ( ) , 

_compilerNames . end ( ) , out ) ; 
map<string, CompilerData> : :iterator compit; 
for (compit = compilerlnfo . begin ( ) ; 

compit != compilerlnfo . end () ; compIt++) { 

OS << "Compiler: [" << (* compit) .first << 

"]" « endl; 
CompilerDataS cd = (*compIt) .second; 
OS << "ob jExtension: " << cd . ob jExtens ion 

<< "\nexeExtension: " << cd . exeExtens ion 

« endl; 

copy (cd. rules .begin ( ) , cd. rules .end ( ) , out) ; 
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cout << "Won't compile with: " << endl; 
copy (cd. fails .begin ( ) , cd. fails .end ( ) , 



// Manage makefile creation
// Create the makefile for this directory, based
// on each of the CodeFile entries:
class Makefile {
  vector<CodeFile> codeFiles;
  // All the different paths
  // (for creating the Master makefile):
  static set<string> paths;
  void
  createMakefile(string compiler, string path);
public:
  Makefile() {}
  void addEntry(CodeFile& cf) {
    paths.insert(cf.path()); // Record all paths
    // Tell it what compilers don't work with it:
    CompilerData::addFailures(cf);
    codeFiles.push_back(cf);
  }
  // Write the makefile for each compiler:
  void writeMakefiles(string path);
  // Create the master makefile:
  static void writeMaster(string flag = "");
};

set<string> Makefile::paths;

void Makefile::writeMakefiles(string path) {
  if(trim(path).length() == 0)
    return; // No makefiles in root directory
  PushDirectory pd(path);
  set<string>& compilers =
    CompilerData::compilerNames();
  set<string>::iterator it = compilers.begin();
  while(it != compilers.end())
    createMakefile(*it++, path);
}
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oid Makefile: :createMakefile 1 
string compiler, string path) { 

exe (CompilerData: :exe (compiler) ) , 
ob j (CompilerData : : ob j (compiler ) ) ; 

string filename (compiler + ".makefile"); 

of stream makefile (filename . c_str ( ) ) ; 

makefile « 

# From Thinking in C++, 2nd Edition\n" 

# At http: //www.BruceEckel .com\n" 

# (c) Bruce Eckel 1999\n" 

# Copyright notice in Copyright . txt\n" 

# Automatically-generated MAKEFILE \n" 

# For examples in directory "+ path + "\n" 

# using the " + compiler + " compiler\n" 

# Note: does not make files that will \n" 

# not compile with this compiler\n" 

# Invoke with: make -f " 
compiler + " . makef ile\n" 



« endl; 
CompilerDa 



akefile) ; 



iteRules (compile 
vector<string> makeAll, makeTest, 

makeBugs, makeDeps, linkCmd; 
// Write the "all" dependencies: 
makeAll. push_back( "all: ") ; 
makeTest. push_back("test: all "); 
makeBugs . push_back ( "bugs : " ) ; 

vector<CodeFile>: : iterator it; 
for (it = codeFiles .begin 1 ) ; 

it != codeFiles. endl) ; it++) { 

CodeFiles cf = *it; 

if (cf .targetTypeO == executable) { 

line = "\\\n\t"+cf .targetNamel)+ exe + 
if (cf .compilesOK (compiler) == false) { 
makeBugs . push_back ( 

CompilerData: :adjustPath ( 
compiler, line) ) ; 
) else ! 

makeAll .push_back ( 

CompilerData : : ad justPath ( 
c omp iler,line) ) ; 
line = "\\\n\t" + cf . targetName ( ) + e 



' ' + cf .testArgs () + ' '; 
makeTest.push_back ( 

CompilerData : : ad justPath ( 
compiler, line) ) ; 

1 

// Create the link command: 

int linkdeps = cf . 1 ink () . s i ze () ; 

string linklist; 

for lint i = D; i < linkdeps; i + +) 

linklist += 

cf .linkl) .operator!] (i) + ob j + " " ; 
line = cf .targetName 1) + exe + ": " 

+ linklist + "\n\t$(CPP) S (OFLAG) " 

+ cf .targetName () + exe 

+ ' • + linklist + "XnXn"; 
linkCmd.pi]sh_back ( 

CompilerData: : ad justPath (compiler, line) ) ; 
1 

// Create dependencies 
if (cf .targetType == executable 
II cf. targetType == object) { 
int compiledeps = cf . compile ( ) . s i ze ( ) ; 
string ob jlist (cf . base () + ob j + ": "); 
for (int i = 0; i < compiledeps; i + +) 



objlist += 

cf .compile 1) .operator [] (i) 
lakeDeps .push_back ( 

CompilerData: : ad justPath ( 

)iler, objlist) +"\n"); 






ikefile, "") ; 



// The "all" target: 

copy (makeAll .begin () , makeAll .end ( ) , mkos) ; 

*mkos++ = "\n\n"; 

// Remove continuation marks from maks 

vector<string>: :iterator si = makeTesI 

int bsl; 

for(; si != makeTest.endO; si + +) 

if((bsl= (*si) .find("\\\n") ) != str: 

// Now print the "test" target: 
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copy ImakeTest. begin 1) , makeTest . end ( ) , mkos); 

// The "bugs" target: 

copy (makeBugs .begin () , makeBugs .end ( ) , mkos) ; 

if (makeBugs. size () == 1) 

*mkos++ = "\n\tgecho No compiler bugs in " 
"this directory! " ; 
*mkos++ = "\n\n"; 
// Link commands: 
copy (linkCmd.begin () , linkCmd.end ( ) , mkos) ; 

// Demendencies : 

copy (makeDeps .begin ( ) , makeDeps .end ( ) , mkos) ; 



oid Makefile: :writeMaster (string flag) { 
string filename = "makefile"; 
if (flag. length () != 0) 

filename += ' . ' + flag; 
of stream makefile (filename . c_str () ) ; 
makefile « "# Master makefile for " 

Thinking in C++, 2nd Ed. by Bruce Eckel\n" 

# at http://www.BruceEckel.com\n" 

# Compiles all the code in the book\n" 

# Copyright notice in Copyright . txt\n\n" 
help: \n" 

\taecho To compile all programs from \n" 
\taecho Thinking in C++, 2nd Ed., type\n" 
\tSecho one of the following commands, \n" 
\t@echo according to your compiler : \n" ; 
<string>S n = CompilerData : : compilerNames ( 



for(nit = n.beginO; nit != n . end ( ) ; nit + 
makefile « 

makefile « endl; 
// Make for each compiler: 

for(nit = n.beginO; nit != n . end ( ) ; nit + 
makefile « *nit « ":\n"; 
for (set<string>: :iterator it = paths.be 
it != paths. endO; it + +) { 
// Ignore the root directory: 
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if 1 C-it) .length 1) == 0) continue; 
makefile « "\tcd " « '"it; 
// Different commands for unix vs. do 
if (CompilerData: :isUnix C-nit) ) 
makefile « " ; " ; 

makefile « "XnXt"; 
makefile « "make -f " « *nit 

« ".makefile"; 
if (f lag. length != 0) { 

makefile « ' '; 

if (flag == "bugs") 
makefile « "-i "; 

makefile « flag; 
) 

makefile « "\n"; 
if (CompilerData: : i sUnix ( *nit ) == falsi 

makefile « "\tcd . . \n" ; 



if(argc < 2) { 

error ("Command line error", usage); 

exit (1) ; 
1 

// For development S testing, leave off no 
if (argc == 3) 

if (string (argv[2] ) == " -nocopyr ight" ) 
copyright = ""; 
// Open the input file to read the compile 
// information database: 
ifstream in(argv[l] ) ; 
ifdin) { 

error (string ("can't open ") + argv[l],us. 

exit (1) ; 



} 



mg 



while (getline (in, s) ) { 

// Break up the strings to preve. 
// this code is seen by this proi 
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if (s .find ("#: " " : CompileDB . txt" ) 
!= string: :npo3) { 

// Parse the compiler informatio 
CompilerData: : readDB (in) ; 
break; // Out of while loop 



if (in.eof 0) 

error ("CompileDB.txt", "Can't find da 
in.3eekg(0, io3::beq); // Back to begin. 
mapOtring, Makefile> makeFiles; 
while (getline (in, a)) { 

// Look for tag at beginning of line: 
if (s.findCV/" '■:'■) == 

I I S.findCV*" ":") == 
I I s.findl"#" ":") == 0) { 
CodeFile cf (in, s) ; 

cf. write 1); // Tell it to write it 
makeFiles [cf .path () ] . addEntry (cf ) ; 



  // Write all the makefiles, telling each
  // the path where it belongs:
  map<string, Makefile>::iterator mfi;
  for(mfi = makeFiles.begin();
    mfi != makeFiles.end(); mfi++)
    (*mfi).second.writeMakefiles((*mfi).first);
  // Create the master makefile:
  Makefile::writeMaster();
  // Write the makefile that tries the bug files:
  Makefile::writeMaster("bugs");
} ///:~

The first tool you see is trim( ), which was lifted from the strings chapter earlier in the book.
It removes the whitespace from both ends of a string object. This is followed by the usage
string, which is printed whenever something goes wrong with the program.

The error( ) function is global because it uses a trick with static objects inside functions.
error( ) is designed so that if it is never called, no error reporting occurs, but if it is called one
or more times then an error file is created and the total number of errors is reported at the end
of the program execution. This is accomplished by creating a nested class ErrReport and
making a static ErrReport object inside error( ). That way, an ErrReport object is only
created the first time error( ) is called, so if error( ) is never called no error reporting will
occur. ErrReport creates an ofstream to write the errors to, and the ErrReport destructor
closes the ofstream, then re-opens it and dumps it to cerr. This way, if the error report is too
long and scrolls off the screen, you can use an editor to look at it. The count of the number of
errors is held in ErrReport, and this is also reported upon program termination.
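The lazy-creation trick can be sketched in isolation. This is a minimal illustration, not the book's ErrReport: the names ErrorLog and reportError( ) are hypothetical, the class is not nested, and it writes straight to cerr rather than to an error file.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical reporter, standing in for the book's ErrReport.
class ErrorLog {
  int count_;
public:
  ErrorLog() : count_(0) {}
  ~ErrorLog() {
    // Runs at program exit, but only if the object was ever created.
    std::cerr << count_ << " error(s) reported" << std::endl;
  }
  void add(const std::string& msg) {
    ++count_;
    std::cerr << "error: " << msg << std::endl;
  }
  int count() const { return count_; }
};

// The static local is constructed the first time control passes
// through its declaration, that is, on the first call.
// Returns the running error count (an addition made for this sketch).
int reportError(const std::string& msg) {
  static ErrorLog log;
  log.add(msg);
  return log.count();
}
```

If reportError( ) is never called, no ErrorLog object is ever constructed, so its destructor never runs and nothing is printed at exit.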

The job of a PushDirectory object is to capture the current directory, then create and move
down each directory in the path (the path can be arbitrarily long). Each subdirectory in the
file's path description is separated by a ':' and mkdir( ) and chdir( ) (or the equivalent on
your system) are used to move into only one directory at a time, so the actual character that's
used to separate directory paths is safely ignored. The destructor restores the directory
that was captured before all the creating and moving took place.
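The capture-and-restore idea can be sketched with the C++17 std::filesystem library, which did not exist when this program was written. The class name ScopedDirectory is hypothetical, and unlike PushDirectory it takes an ordinary path and creates all levels in one call instead of stepping through ':'-separated components.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for PushDirectory, built on std::filesystem
// instead of the OS-specific mkdir()/chdir() calls.
class ScopedDirectory {
  fs::path saved_;
public:
  explicit ScopedDirectory(const fs::path& target)
    : saved_(fs::current_path()) {   // Capture where we started
    fs::create_directories(target);  // Create each missing level
    fs::current_path(target);        // Move down into it
  }
  ~ScopedDirectory() {
    std::error_code ec;              // Never throw from a destructor
    fs::current_path(saved_, ec);    // Restore the captured directory
  }
  ScopedDirectory(const ScopedDirectory&) = delete;
  ScopedDirectory& operator=(const ScopedDirectory&) = delete;
};
```

Creating a ScopedDirectory at the top of a block moves into the target directory, and leaving the block (normally or by exception) moves back.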

Unfortunately, there are no functions in Standard C or Standard C++ to control directory
creation and movement, so this is captured in the class OSDirControl. After reading the
design patterns chapter, your first impulse might be to use the full "Bridge" pattern. However,
there's a lot more going on here. Bridge generally works with things that are already classes,
and here we are actually creating the class to encapsulate operating system directory
control. In addition, this requires #ifdefs and #includes for each different operating system
and compiler. However, the basic idea is that of a Bridge, since the rest of the code
(PushDirectory is actually the only thing that uses this, and thus it acts as the Bridge
abstraction) treats an OSDirControl object as a standard interface.

All the information about a particular source code file is encapsulated in a CodeFile object.
This includes the type of target the file should produce and variations on the name of the file,
including the name of the target file it's supposed to produce. The entire contents of the file are
contained in the vector<string> lines. In addition, the file's dependencies (the files which, if
they change, should cause a recompilation of the current file) and the files on the linker
command line are also vector<string> objects. The CodeFile object keeps all the compilers it
won't work with in _noBuild, which is a set<string> because it's easier to look up an
element in a set. The writeTags flag indicates whether the beginning and ending markers
from the book listing should actually be output to the generated file.

The three private helper functions target( ), headerLine( ) and dependLine( ) are used by the
CodeFile constructor while it is parsing the input stream. In fact, the CodeFile constructor
does much of the work and most of the rest of the member functions simply return values that
are stored in the CodeFile object. Exceptions to this are addFailure( ), which stores a
compiler that won't work, and compilesOK( ), which, when given a compiler, tells whether
this file will compile successfully with that compiler. The ostream operator<< uses the STL
copy( ) algorithm and write( ) uses operator<< to write the file into a particular directory and
file name.
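The pairing of addFailure( ) and compilesOK( ) around a set<string> can be sketched as follows. BuildInfo is a hypothetical, cut-down stand-in for CodeFile that keeps only the set of failing compilers; the real class holds much more.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical reduction of CodeFile to its failure bookkeeping.
class BuildInfo {
  std::set<std::string> noBuild_;  // Compilers that fail on this file
public:
  // Record a compiler that won't work with this file.
  void addFailure(const std::string& compiler) {
    noBuild_.insert(compiler);
  }
  // A compiler is OK if it was never recorded as a failure.
  bool compilesOK(const std::string& compiler) const {
    return noBuild_.count(compiler) == 0;
  }
};
```

Because a set stores each element once and looks it up by value, recording the same compiler twice is harmless and the query is a single count( ) call.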

Looking at the implementation, you'll see that the helper functions target( ), headerLine( )
and dependLine( ) are just using string functions in order to search and manipulate the lines.
The constructor is what initiates everything. The idea is that the main program opens the file
and reads it until it sees the starting marker for a code file. At that point it makes a CodeFile
object and hands the constructor the istream (so the constructor can read the rest of the code
file) and the first line that was already read, since it contains valuable information. This first
line is dissected for the file name information and the target type. The beginning of the file is
written (source and copyright information is added) and the rest of the file is read, until the
ending tag. The top few lines may contain information about link dependencies and command
line arguments, or they may be files that are #included using quotes rather than angle
brackets. Quotes indicate they are from local directories and should be added to the makefile
dependency.

You'll notice that a number of the marker strings in this program are broken up into two
adjacent character strings, relying on the preprocessor to concatenate those strings. This
prevents the ExtractCode program from accidentally mistaking the strings
embedded in the program for the end marker when ExtractCode is extracting its own
source code.
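A small sketch of the split-literal idea, using the start-of-listing marker that main( ) searches for. The names startMarker and isStartLine( ) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Two adjacent literals: the compiler joins them into one
// three-character string, but a tool scanning this source file
// line by line never sees the marker spelled out in the literal.
const std::string startMarker = "//" ":";

// True when the line begins with the start marker.
bool isStartLine(const std::string& line) {
  return line.find(startMarker) == 0;
}
```

The joined string behaves exactly like a single literal at run time; the split matters only to programs that read the source text.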

The goal of CompilerData is to capture and make available all the information about
particular compiler idiosyncrasies. At first glance, the CompilerData class appears to be a
container of static member functions, a library of functions wrapped in a class. Actually, the
class contains two static data members; the simpler one is a set<string> that holds all the
compiler names, but compilerInfo is a map that maps string objects (the compiler name) to
CompilerData objects. Each individual CompilerData object in compilerInfo contains a
vector<string> which is the "rules" that are placed in the makefile (these rules are different
for different compilers) and a set<string> which indicates the files that won't compile with
this particular compiler. Also, each compiler creates different extensions for object files and
executable files, and these are also stored. There are two flags which indicate if this is a "dos"
or "Unix" style environment (this causes differences in path information and command styles
for the resulting makefiles).

The member function readDB( ) is responsible for taking an istream and parsing it into a
series of CompilerData objects which are stored in compilerInfo. By choosing a relatively
simple format (which you can see in Appendix D) the parsing of this configuration file is
fairly simple: the first character on a line determines what information the line contains. A '#'
sign is a comment, a '{' indicates that the next compiler configuration is beginning and this is
the new compiler name, a '(' is used to establish the object file extension name, a '[' (as the
listing shows) does the same for the executable file extension, a '&'
indicates the "dos" or "Unix" directive, and '@' is a makefile rule which is placed verbatim at
the beginning of the makefile. If there is no special character at the beginning of the line, then
it must be a file that fails to compile.
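The first-character dispatch can be sketched on its own. This is an illustration under assumptions, not readDB( ) itself: LineKind, classify( ) and between( ) are hypothetical names, and the '[' case for the executable extension is taken from the listing.

```cpp
#include <cassert>
#include <string>

// Kinds of line in the configuration format described above.
enum LineKind {
  Blank, Comment, CompilerName, ObjExtension,
  ExeExtension, Directive, MakeRule, FailingFile
};

// Classify an already-trimmed line by its first character.
LineKind classify(const std::string& s) {
  if(s.empty()) return Blank;
  switch(s[0]) {
    case '#': return Comment;
    case '{': return CompilerName;
    case '(': return ObjExtension;
    case '[': return ExeExtension;
    case '&': return Directive;
    case '@': return MakeRule;
    default:  return FailingFile;
  }
}

// Return the text after the opening character at s[0] and before
// the closing character, with surrounding blanks and tabs removed.
std::string between(const std::string& s, char close) {
  assert(!s.empty());
  std::string::size_type end = s.find(close);
  std::string inner = (end == std::string::npos)
    ? s.substr(1) : s.substr(1, end - 1);
  std::string::size_type b = inner.find_first_not_of(" \t");
  if(b == std::string::npos) return "";
  std::string::size_type e = inner.find_last_not_of(" \t");
  return inner.substr(b, e - b + 1);
}
```

A line such as "{ gcc }" classifies as CompilerName and between( ) extracts "gcc" from it; any line that starts with no special character falls through to FailingFile.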

The addFailures( ) member function takes its CodeFile argument (by reference, so it can
modify the outside object) and checks each compiler to see if it works with that particular
code file; if not, it adds that compiler to the CodeFile object's failure list.



Both obj( ) and exe( ) return the appropriate file extension for a particular compiler. Note that
some situations don't expect extensions, and so the '.' is added only if there is an extension.

When the makefile is being created, one of the first things to do is add the various make rules, 
such as the prefixes and target rules (see Appendix D for examples). This is accomplished 
with writeRules( ). Note the use of the STL copy( ) algorithm. 

Although dos compilers have no trouble with forward slashes as part of the paths of #include
files, most dos make programs expect backslashes as part of paths in dependency lists. To
adjust for this, the adjustPath( ) function checks to see if this is a dos compiler, and if so it
uses the STL replace( ) algorithm, treating the path string object as a container, to replace
forward-slash characters with backward slashes.
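Treating a string as a container for replace( ) looks like this in isolation. The function name toDosPath( ) is hypothetical, and its bool parameter stands in for the per-compiler "dos" flag that adjustPath( ) looks up.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Convert forward slashes to backslashes only when asked to.
// path is taken by value so the caller's string is left alone.
std::string toDosPath(std::string path, bool dosStyle) {
  if(dosStyle)
    std::replace(path.begin(), path.end(), '/', '\\');
  return path;
}
```

std::replace( ) works on any range of forward iterators, and string supplies begin( ) and end( ) just as vector does, which is why no special string function is needed.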

The last class, Makefile, is used to create all the makefiles, including the master makefile that
moves into each subdirectory and calls the other makefiles. Each Makefile contains a group
of CodeFile objects, stored in a vector. You call addEntry( ) to put a new CodeFile into the
Makefile; this also adds the failure list to the CodeFile. In addition, there is a static
set<string> which contains all the different paths where all the different makefiles will be
written; this is used to build the master makefile so it can call all the makefiles in all the
subdirectories. The addEntry( ) function also updates this set of paths.

To write the makefile for a particular path (once the entire book file has been read), you call
writeMakefiles( ) and hand it the path you want it to write the makefile for. This function
simply iterates through all the compilers in compilers and calls createMakefile( ) for each
one, passing it the compiler name and the path. The latter function is where the real work gets
done. First the file name extensions are captured into local string objects, then the file name
is created from the name of the compiler with ".makefile" concatenated (you can use a file
with a name other than "makefile" by using the make -f flag). After writing the header
comments and the rules for that particular compiler/operating-system combination
(remember, these rules come from the compiler configuration file), a vector<string> is
created to hold each of the different regions of the makefile: the master target list makeAll, the
testing commands makeTest, the dependencies makeDeps, and the commands for linking
into executables linkCmd. The reason it's necessary to have lists for these four regions is that
each CodeFile object causes entries into each region, so the regions are built as the list of
CodeFiles is traversed, and then finally each region is written in its proper order. This is the
function which decides whether a file is going to be included, and also calls adjustPath( ) to
conditionally change forward slashes to backward slashes.

To write the master makefile in writeMaster( ), the initial comments are written first. The default
target is called "help," and it is used if you simply type make. This provides very simple help
to the first-time user, including the options for make that this makefile supports (that is, all the
different compilers the makefile is set up for). Then it creates the list of commands for each
compiler, which basically consists of descending into a subdirectory, calling make (recursively)
on the appropriate makefile in that subdirectory, and then rising back up to the book's root
directory. Makefiles in Unix and dos work very differently from each other in this
situation: in Unix, you cd to the directory, followed by a semicolon and then the command
you want to execute - returning to the root directory happens automatically. In dos, you
must cd both down and then back up again, all on separate lines. So the writeMaster( )
function must ask whether a compiler is running under Unix and write different
commands accordingly.

Because of the work done in designing the classes (and this was an iterative process; it didn't
just pop out this way), main( ) is quite straightforward to read. After opening the input file,
the getline( ) function is used to read each input line until the line containing CompileDB.txt
is found; this indicates the beginning of the compiler database listing. Once that has been
parsed, seekg( ) is used to move the file pointer back to the beginning so all the code files can
be extracted.

Each line is read and if one of the start markers is found in the line, a CodeFile object is
created using that line (which has essential information) and the input stream. The constructor
returns when it finishes reading its file, and at that point you can turn around and call write( )
for the code file, and it is automatically written to the correct spot. (An earlier version of this
program collected all the CodeFile objects first and put them in a container, then wrote one
directory at a time, but the approach shown above has code that's easier to understand and the
performance impact is not really significant for a tool like this.)

For makefile management, a map<string, Makefile> is created, where the string is the path
where the makefile exists. The nice thing about this approach is that the Makefile objects will
be automatically created whenever you access a new path, as you can see in the line

makeFiles[cf.path()].addEntry(cf);

Then to write all the makefiles you simply iterate through the makeFiles map.
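The automatic creation comes from map's operator[ ], which default-constructs a value the first time a key is used. A minimal sketch, with the hypothetical FileGroup and groupByPath( ) standing in for Makefile and the loop in main( ):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Makefile: it only collects
// the names of the files added to it.
struct FileGroup {
  std::vector<std::string> files;
  void addEntry(const std::string& name) { files.push_back(name); }
};

// Group (path, file) pairs by path. operator[] creates an empty
// FileGroup the first time a path is seen, then returns a
// reference to the same object on every later access.
std::map<std::string, FileGroup> groupByPath(
  const std::vector<std::pair<std::string, std::string> >& entries) {
  std::map<std::string, FileGroup> groups;
  for(std::size_t i = 0; i < entries.size(); i++)
    groups[entries[i].first].addEntry(entries[i].second);
  return groups;
}
```

The cost of this convenience is that merely reading groups[key] for an unknown key also inserts an entry, so find( ) is the better choice when you only want to test for presence.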



Debugging 



assert( ) 

The Standard C library assert( ) macro is brief, to the point and portable. In addition, when
you're finished debugging you can remove all the code by defining NDEBUG, either on the
command-line or in code.

Also, assert( ) can be used while roughing out the code. Later, the calls to assert( ) that are 
actually providing information to the end user can be replaced with more civilized messages. 
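A short sketch of assert( ) guarding preconditions. The function elementAt( ) is hypothetical; the point is only that the checks vanish when NDEBUG is defined before <cassert> is included.

```cpp
#include <cassert>
#include <cstddef>

// Preconditions are checked only in debug builds. With NDEBUG
// defined, each assert() below expands to nothing.
int elementAt(const int* data, std::size_t size, std::size_t i) {
  assert(data != 0);
  assert(i < size);  // A programmer error, not bad user input
  return data[i];
}
```

Since the expression inside assert( ) disappears entirely under NDEBUG, it must never contain work the program depends on, such as a function call with side effects.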

Trace macros 

Sometimes it's useful to print the code of each statement as it is executed, either to
cout or to a trace file. Here's a preprocessor macro to accomplish this:

#define TRACE(ARG) cout << #ARG << endl; ARG

Now you can go through and surround the statements you trace with this macro. Of course, it
can introduce problems. For example, if you take the statement:

for(int i = 0; i < 100; i++)
  cout << i << endl;

and put both lines inside TRACE( ) macros, you get this:

TRACE(for(int i = 0; i < 100; i++))
TRACE(  cout << i << endl;)

which expands to this:

cout << "for(int i = 0; i < 100; i++)" << endl;
for(int i = 0; i < 100; i++)
  cout << "cout << i << endl;" << endl;
cout << i << endl;

which isn't what you want: the loop now controls only the statement that prints the trace text,
and the real output statement falls outside the loop. Thus, this technique must be used carefully.

A variation on the TRACE( ) macro is this:

#define D(a) cout << #a "=[" << a << "]" << nl;

If there's an expression you want to display, you simply put it inside a call to D( ) and the
expression will be printed, followed by its value (assuming there's an overloaded operator<<
for the result type). For example, you can say D(a + b). Thus you can use it anytime you want
to test an intermediate value to make sure things are OK.
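The stringizing behind D( ) can be checked with a variant that writes to a stream you supply. D_TO and showSum( ) are hypothetical names introduced here so the output can be captured; the variant also ends the line with '\n' instead of a manipulator.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Like D(), but the destination stream is a macro argument.
// #a turns the expression text into a string literal, which is
// then joined with the adjacent "=[" literal by the compiler.
#define D_TO(os, a) (os) << #a "=[" << (a) << "]" << '\n'

// Format "expression=[value]" for a sum, using the macro.
std::string showSum(int a, int b) {
  std::ostringstream out;
  D_TO(out, a + b);
  return out.str();
}
```

Calling showSum(2, 3) yields the text a + b=[5] followed by a newline: the expression is shown as written in the source, and the value is whatever it evaluates to at run time.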

Of course, the above two macros are actually just the two most fundamental things you do
with a debugger: trace through the code execution and print values. A good debugger is an
excellent productivity tool, but sometimes debuggers are not available, or it's not convenient
to use them. The above techniques always work, regardless of the situation.



Trace file 



The following header creates a trace file and sends everything that would normally go
to cout into the file. All you have to do is #define TRACEON and include the header file
(of course, it's fairly easy just to write the two key lines right into your file):

//: C10:Trace.h
// Creating a trace file
#ifndef TRACE_H
#define TRACE_H
#include <fstream>

#ifdef TRACEON
ofstream TRACEFILE("TRACE.OUT");
#define cout TRACEFILE
#endif

#endif // TRACE_H ///:~

Here's a simple test of the above file: 

//: C10:Tracetst.cpp
// Test of trace.h
#include "../require.h"
#include <iostream>
#include <fstream>
using namespace std;

#define TRACEON
#include "Trace.h"

int main() {
  ifstream f("Tracetst.cpp");
  assure(f, "Tracetst.cpp");
  cout << f.rdbuf();
} ///:~

This also uses the assure( ) function defined earlier in the book.



Abstract base class for debugging 

... class and redefine the debugging functions. All objects in the system will then have
debugging functions available.

Tracking new/delete & malloc/free 

Common problems with memory allocation include calling delete for things you have
malloced, calling free for things you allocated with new, forgetting to release objects from
the free store, and releasing them more than once. This section provides a system to help you
track these kinds of problems down.

To use the memory checking system, you simply link the obj file in and all the calls to
malloc( ), realloc( ), calloc( ), free( ), new and delete are intercepted. However, if you also
include the following file (which is optional), all the calls to new will store information about
the file and line where they were called. This is accomplished with a use of the placement
syntax for operator new (this trick was suggested by Reg Charney of the C++ Standards
Committee). The placement syntax is intended for situations where you need to place objects
at a specific point in memory. However, it allows you to create an operator new with any
number of arguments. This is used to advantage here to store the results of the __FILE__ and
__LINE__ macros whenever new is called:

//: C10:MemCheck.h
// Memory testing system
// This file is only included if you want to
// use the special placement syntax to find
// out the line number where "new" was called.
#ifndef MEMCHECK_H
#define MEMCHECK_H
#include <cstdlib> // size_t
// Use placement syntax to pass extra arguments.
// From an idea by Reg Charney:
void* operator new(
  std::size_t sz, char* file, int line);
#define new new(__FILE__, __LINE__)
#endif // MEMCHECK_H ///:~

In the following file containing the function definitions, you will note that everything is done
with standard I/O rather than iostreams. This is because, for example, the cout constructor
allocates memory. Standard I/O ensures against cyclical conditions that can lock up the
system.

//: C10:MemCheck.cpp {O}
// Memory allocation tester
#include <cstdlib>
#include <cstring>
#include <cstdio>
using namespace std;
// MemCheck.h must not be included here

// Output file object using cstdio
// (cout constructor calls malloc())
class OFile {
  FILE* f;
public:
  OFile(char* name) : f(fopen(name, "w")) {}
  ~OFile() { fclose(f); }
  operator FILE*() { return f; }
};
extern OFile memtrace;
// Comment out the following to send all the
// information to the trace file:
#define memtrace stdout

const unsigned long _pool_sz = 50000L;
static unsigned char _memory_pool[_pool_sz];
static unsigned char* _pool_ptr = _memory_pool;

void* getmem(size_t sz) {
  if(_memory_pool + _pool_sz - _pool_ptr < sz) {
    fprintf(stderr,
      "Out of memory. Use bigger model\n");
    exit(1);
  }
  void* p = _pool_ptr;
  _pool_ptr += sz;
  return p;
}



// Holds information about allocated pointers
class MemBag {
public:
  enum type { Malloc, New };
private:
  char* typestr(type t) {
    switch(t) {
      case Malloc: return "malloc";
      case New: return "new";
      default: return "?unknown?";
    }
  }
  struct M {
    void* mp;   // Memory pointer
    type t;     // Allocation type
    char* file; // File name where allocated
    int line;   // Line number where allocated
    M(void* v, type tt, char* f, int l)
      : mp(v), t(tt), file(f), line(l) {}



static const int increment = 50 ; 
public: 

MemBag : vlO), szlO), next ( ) {] 
void* add (void* p, type tt = Malloc, 

char* s = "library", int 1 = 0) { 
if (next >= sz) { 

// This memory is never freed, so it 
// doesn't "get involved" in the test: 
const int memsize = sz * sizeof (M) ; 
// Equivalent of realloc, no registratio 
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void' p = getmem(memsize) ; 
iflv) memmovelp, v, mems i ze ) ; 
V = (M'-jp; 
memset lEv[next], D, 

increment * sizeof (M) ) ; 
1 

v[next++] = Mlp, tt, s, 1) ; 
return p; 
1 

// Print information about allocation: 
void allocation lint i) { 

fprintf (memtrace, "pointer %p" 
" allocated with %s", 
v[i] .mp, typestr (v[i] -t) ) ; 
if lv[i] .t == New) 

v[i] .file, v[i] .line) ; 
fprintf (memtrace, "Xn"); 
1 

void validate (void* p, type T = Malloc) | 
for lint i = 0; i < next; i + +) 
if lv[i] .mp == p) { 
if lv[i] .t != T) { 
allocation(i); 
fprintf (memtrace, 

"\t was released as if it were " 
"allocated with %s \n", typestr ( T )) ; 
1 
v[i] .mp =0; // Erase it 



'po 



ry li 



-MemBag ( ) { 

for lint i = 0; i < next; i + +) 
if lv[i] .mp != 0) { 
fprintf (memtrace, 
"pointer not released: "); 
allocation (i) ; 
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tern MemBag MEMBAG_; 

void* malloc(size_t sz) {
  void* p = getmem(sz);
  return MEMBAG_.add(p, MemBag::Malloc);
}

void* calloc(size_t num_elems, size_t elem_sz) {
  void* p = getmem(num_elems * elem_sz);
  memset(p, 0, num_elems * elem_sz);
  return MEMBAG_.add(p, MemBag::Malloc);
}

void* realloc(void* block, size_t sz) {
  void* p = getmem(sz);
  if(block) memmove(p, block, sz);
  return MEMBAG_.add(p, MemBag::Malloc);
}

void free(void* v) {
  MEMBAG_.validate(v, MemBag::Malloc);
}

void* operator new(size_t sz) {
  void* p = getmem(sz);
  return MEMBAG_.add(p, MemBag::New);
}

void*
operator new(size_t sz, char* file, int line) {
  void* p = getmem(sz);
  return MEMBAG_.add(p, MemBag::New, file, line);
}

void operator delete(void* v) {
  MEMBAG_.validate(v, MemBag::New);
}

MemBag MEMBAG_;

// Placed here so the const 

// AFTER that of MEMBAG_ : 

#ifdef memtrace 

#undef memtrace
OFile is a simple wrapper around a FILE*; the constructor opens the file and the destructor
closes it. The operator FILE*( ) allows you to simply use the OFile object anyplace you
would ordinarily use a FILE* (in the fprintf( ) statements in this example). The #define that
follows simply sends everything to standard output, but if you need to put it in a trace file you
simply comment out that line.

Memory is allocated from an array called _memory_pool. The _pool_ptr is moved forward
every time storage is allocated. For simplicity, the storage is never reclaimed, and realloc( )
doesn't try to resize the storage in the same place.
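The move-the-pointer-forward scheme can be sketched as a small class. BumpPool, its 1024-byte size and its member names are illustrative assumptions, not those of MemCheck.cpp, and like the original it makes no attempt to align the addresses it returns.

```cpp
#include <cassert>
#include <cstddef>

// A fixed pool handed out front to back; nothing is ever reclaimed.
class BumpPool {
  static const std::size_t kSize = 1024;
  unsigned char pool_[kSize];
  unsigned char* next_;  // First byte not yet handed out
public:
  BumpPool() : next_(pool_) {}
  // Bytes still available at the end of the pool.
  std::size_t remaining() const {
    return static_cast<std::size_t>(pool_ + kSize - next_);
  }
  // Returns 0 when the request does not fit. (getmem() in the
  // listing prints a message and exits instead.)
  void* get(std::size_t sz) {
    if(remaining() < sz) return 0;
    void* p = next_;
    next_ += sz;
    return p;
  }
};
```

Two successive requests return adjacent blocks, and once the pool is exhausted every further request fails, which is acceptable for a debugging aid but not for a general-purpose allocator.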

All the storage allocation functions call getmem( ), which ensures there is enough space left
and moves the _pool_ptr to allocate your storage. Then they store the pointer in a special
container of class MemBag called MEMBAG_, along with pertinent information (notice the
two versions of operator new; one which just stores the pointer and the other which stores
the file and line number). The MemBag class is the heart of the system.

You will see many similarities to xbag in MemBag. A distinct difference is realloc( ) is 
replaced by a call to getmem( ) and memmove( ), so that storage allocated for the MemBag
is not registered. In addition, the type enum allows you to store the way the memory was 
allocated; the typestr( ) function takes a type and produces a string for use with printing. 

The nested struct M holds the pointer, the type, a pointer to the file name (which is assumed 
to be statically allocated) and the line where the allocation occurred. v is a pointer to an array
of M objects - this is the array which is dynamically sized. 

The allocation( ) function prints out a different message depending on whether the storage 
was allocated with new (where it has line and file information) or malloc( ) (where it
doesn't). This function is used inside validate( ), which is called by free( ) and delete( ) to
ensure everything is OK, and in the destructor, to ensure the pointer was cleaned up (note that
in validate( ) the pointer value v[i].mp is set to zero, to indicate it has been cleaned up).

The following is a simple test using the memcheck facility. The MemCheck.obj file must be
linked in for it to work:

//: C10:MemTest.cpp
//{L} MemCheck
// Test of MemCheck system
#include "MemCheck.h"

int main() {
  std::malloc(100);

  std::free(x);
  new double;
} ///:~

The trace file created in MemCheck.cpp causes the generation of one "pointer not in memory
list" message, apparently from the creation of the file pointer on the heap. [[ This may not still 
be true — test it ]] 

CGI programming in C++ 

Documents on the World Wide Web are created in
hypertext markup language (HTML) and placed on a central server machine where they are
handed to anyone who asks. The documents are requested and read using a web browser that 
has been written or ported to each particular platform. 

Very quickly, just reading a document was not enough and people wanted to be able to collect
information from the clients, for example to take orders or allow database lookups from the
server. Many different approaches to client-side programming have been tried such as Java
applets, JavaScript, and other scripting or programming languages. Unfortunately, whenever
you publish something on the Internet you face the problem of a whole history of browsers,
some of which may support the particular flavor of your client-side programming tool, and
some of which won't. The only reliable and well-established solution* to this problem is to use
straight HTML (which has a very limited way to collect and submit information from the
client) and common gateway interface (CGI) programs that are run on the server. The Web
server takes an encoded request submitted via an HTML page and responds by invoking a
CGI program and handing it the encoded data from the request. This request is classified as
either a "GET" or a "POST" (the meaning of which will be explained later) and if you look at
the URL window in your Web browser when you push a "submit" button on a page you'll
often be able to see the encoded request and information.

* Actually, Java Servlets look like a much better solution than CGI, but - at least at this
writing - Servlets are still an up-and-coming solution and you're unlikely to find them
provided by your typical ISP.

CGI can seem a bit intimidating at first, but it turns out that it's just messy, and not all that
difficult to write. (An innocent statement that's true of many things - after you understand
them.) A CGI program is quite straightforward since it takes its input from environment
variables and standard input, and sends its output to standard output. However, there is some
decoding that must be done in order to extract the data that's been sent to you from the
client's web page. In this section you'll get a crash course in CGI programming, and we'll
develop tools that will perform the decoding for the two different types of CGI submissions
(GET and POST). These tools will allow you to easily write a CGI program to solve any
problem. Since C++ exists on virtually all machines that have Web servers (and you can get
GNU C++ free for virtually any platform), the solution presented here is quite portable.



Encoding data for CGI 



To submit data to a CGI program, the HTML "form" tag is used. The following very simple
HTML page contains a form that has one user-input field along with a "submit" button:

//:! C10:SimpleForm.html
<HTML><HEAD>
<TITLE>A simple HTML form</TITLE></HEAD>
Test, uses standard html GET
<Form method="GET" ACTION="/cgi-bin/CGI_GET.exe">
<P>Field1: <INPUT TYPE = "text" NAME = "Field1">
<p><input type = "submit" name = "submit" > </p>
</Form></HTML>
///:~

Everything between the <Form and the </Form> is part of this form (You can have multiple
forms on a single page, but each one is controlled by its own method and submit button). The 
"method" can be either "get" or "post," and the "action" is what the server does when it 
receives the form data: it calls a program. Each form has a method, an action, and a submit 
button, and the rest of the form consists of input fields. The most commonly-used input field 
is shown here: a text field. However, you can also have things like check boxes, drop-down 
selection lists and radio buttons. 

CGI_GET.exe is the name of the executable program that resides in the directory that's
typically called "cgi-bin" on your Web server.* (If the named program is not in the cgi-bin
directory, you won't see any results.) Many Web servers are Unix machines (mine runs
Linux) that don't traditionally use the .exe extension for their executable programs, but you
can call the program anything you want under Unix. By using the .exe extension the program
can be tested without change under most operating systems.

* Free Web servers are relatively common and can be found by browsing the Internet;
Apache, for example, is the most popular Web server on the Internet.

If you fill out this form and press the "submit" button, in the URL address window of your
browser you will see something like:

http://www.pooh.com/cgi-bin/CGI_GET.exe?Field1=
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(Without the line break, of course.) Here you see a little bit of the way that data is encoded to 
send to CGI. For one thing, spaces are not allowed (since spaces typically separate command- 
line arguments). Spaces are replaced by '+' signs. In addition, each field contains the field 
name (which is determined by the form on the HTML page) followed by an '=' and the field 
data, and terminated by a '&'. 

At this point, you might wonder about the '+', '=', and '&'. What if those are used in the
field, as in "John & Marsha Smith"? This is encoded to:

John+%26+Marsha+Smith

That is, the special character is turned into a '%' followed by its ASCII value in hex. 
Fortunately, the web browser automatically performs all encoding for you. 
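
Although the browser does this work for you, the encoding rule is easy to see in code. The following is a simplified sketch, not part of the book's tools: encodeURLString( ) is a hypothetical helper that passes letters and digits through, turns spaces into '+', and escapes every other character as '%' plus two hex digits. Real browsers leave a few punctuation characters unescaped, which this sketch does not attempt to reproduce.

```cpp
#include <cctype>
#include <string>

// Encode one field value roughly the way a browser would
// before appending it to a query string (hypothetical helper).
std::string encodeURLString(const std::string& s) {
  static const char hex[] = "0123456789ABCDEF";
  std::string result;
  for(std::string::size_type i = 0; i < s.size(); i++) {
    unsigned char c = s[i];
    if(c == ' ')
      result += '+';
    else if(std::isalnum(c))
      result += static_cast<char>(c);
    else { // Everything else becomes %XX
      result += '%';
      result += hex[c >> 4];
      result += hex[c & 0x0f];
    }
  }
  return result;
}
```

Calling encodeURLString("John & Marsha Smith") yields the encoded form shown above.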



The CGI parser 



There are many examples of CGI programs written using Standard C. One argument for doing
this is that Standard C can be found virtually everywhere. However, C++ has become quite
ubiquitous, especially in the form of the GNU C++ Compiler* (g++) that can be downloaded
free from the Internet for virtually any platform (and often comes pre-installed with operating
systems such as Linux). As you will see, this means that you can get the benefit of
object-oriented programming in a CGI program.

* GNU stands for "Gnu's Not Unix." The project, created by the Free Software Foundation,
was originally intended to replace the Unix operating system with a free version of that OS.
Linux appears to have replaced this initiative, but the GNU tools have played an integral part
in the development of Linux, which comes packaged with many GNU components.

Since what we're concerned with when parsing the CGI information is the field name-value
pairs, one class (CGIpair) will be used to represent a single name-value pair and a second
class (CGImap) will use CGIpair to parse each name-value pair that is submitted from the
HTML form into keys and values that it will hold in a map-like container of strings so you
can easily fetch the value for each field at your leisure.

One of the reasons for using C++ here is the convenience of the STL containers. CGImap
provides a map-style operator[ ], so you have a nice syntax for extracting the data for each
field. As explained below, CGImap is actually built on the vector template so that the fields
keep their original order, and you'll see it is a fairly short definition considering how
powerful it is.

The project will start with a reusable portion, which consists of CGIpair and CGImap in a
header file. Normally you should avoid cramming this much code into a header file, but for
these examples it's convenient and it doesn't hurt anything:

//: C10:CGImap.h
// Tools for extracting and decoding data
// from CGI GETs and POSTs.



#include <string>
#include <utility>
#include <vector>
#include <iostream>
using namespace std;

class CGIpair : public pair<string, string> {
public:
  CGIpair() {}
  CGIpair(string name, string value) {
    first = decodeURLString(name);
    second = decodeURLString(value);
  }
  // Automatic type conversion for boolean test:
  operator bool() const {
    return (first.length() != 0);
  }
private:
  static string decodeURLString(string URLstr) {
    const int len = URLstr.length();
    string result;
    for(int i = 0; i < len; i++) {
      if(URLstr[i] == '+')
        result += ' ';
      else if(URLstr[i] == '%') {
        result +=
          translateHex(URLstr[i + 1]) * 16 +
          translateHex(URLstr[i + 2]);
        i += 2; // Move past hex code
      } else // An ordinary character
        result += URLstr[i];
    }
    return result;
  }
  // Translate a single hex character; used by
  // decodeURLString():
  static char translateHex(char hex) {
    if(hex >= 'A')
      return (hex & 0xdf) - 'A' + 10;
    else
      return hex - '0';
  }
};
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// Parses any CGI query and turns it into an 
// STL vector of CGIpair which has an associative 
// lookup operator[] like a map. A vector is used 
// instead of a map because it keeps the original 
// ordering of the fields in the Web page form. 
class CGImap : public vector<CGIpair> {
  string gq;
  int index;
  // Prevent assignment and copy-construction:
  void operator=(CGImap&);
  CGImap(CGImap&);
public:
  CGImap(string query) : index(0), gq(query) {
    CGIpair p;
    while((p = nextPair()) != 0)
      push_back(p);
  }
  // Look something up, as if it were a map:
  string operator[](const string& key) {
    iterator i = begin();
    while(i != end()) {
      if((*i).first == key)
        return (*i).second;
      i++;
    }
    return string(); // Empty string == not found
  }
  void dump(ostream& o, string nl = "<br>") {
    for(iterator i = begin(); i != end(); i++)
      o << (*i).first << " = "
        << (*i).second << nl;
  }
  // Produces name-value pairs from the query
  // string. Returns an empty Pair when there's
  // no more query string left:
  CGIpair nextPair() {
    if(gq.length() == 0)
      return CGIpair(); // Error, return empty
    if(gq.find('=') == string::npos)
      return CGIpair(); // Error, return empty
    string name = gq.substr(0, gq.find('='));
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gq = gq. substr (gq.f ind 1 '=' ) + 1); 
    string value = gq.substr(0, gq.find('&'));
    gq = gq.substr(gq.find('&') + 1);
    return CGIpair(name, value);
  }


// Helper 
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}; ///:- 

The CGIpair class starts out quite simply: it inherits from the standard library pair template 
to create a pair of strings, one for the name and one for the value. The second constructor 
calls the member function decodeURLString( ) which produces a string after stripping away 
all the extra characters added by the browser as it submitted the CGI request. There is no need 
to provide functions to select each individual element: because pair is inherited publicly,
you can just select the first and second elements of the CGIpair.

The operator bool provides automatic type conversion to bool. If you have a CGIpair object 
called p and you use it in an expression where a Boolean result is expected, such as 

if(p) { //...
then the compiler will recognize that it has a CGIpair and it needs a Boolean, so it will 
automatically call operator bool to perform the necessary conversion. 

Because the string objects take care of themselves, you don't need to explicitly define the
copy-constructor, operator= or destructor - the default versions synthesized by the compiler 
do the right thing. 
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The remainder of the CGlpair class consists of the two methods clecodeURLString( ) and a 
helper member function translateHex( ) which is used by decodeURLString( ). (Note that
translateHex( ) does not guard against bad input such as "%1H.") decodeURLString( )
moves through and replaces each '+' with a space, and each hex code (beginning with a '%')
with the appropriate character. It's worth noting here and in CGImap the power of the string
class - you can index into a string object using operator[ ], and you can use methods like
find( ) and substr( ).
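
If you do want to guard against malformed escapes, one possible approach is sketched here. It is not the book's code: hexValue( ) and decodeChecked( ) are hypothetical names, and the policy of copying a bad escape through unchanged is an assumption (rejecting the whole request would be an equally reasonable choice).

```cpp
#include <cctype>
#include <string>

// Convert one hex digit to its value, or -1 if it is not a
// hex digit (hypothetical stand-in for translateHex()).
int hexValue(char ch) {
  unsigned char c = static_cast<unsigned char>(ch);
  if(std::isdigit(c)) return c - '0';
  if(std::isxdigit(c)) return std::toupper(c) - 'A' + 10;
  return -1;
}

// Decode '+' and %XX sequences. Malformed escapes such as
// "%1H" or a trailing '%' are copied through unchanged.
std::string decodeChecked(const std::string& s) {
  std::string result;
  for(std::string::size_type i = 0; i < s.size(); i++) {
    if(s[i] == '+')
      result += ' ';
    else if(s[i] == '%' && i + 2 < s.size()
            && hexValue(s[i + 1]) >= 0
            && hexValue(s[i + 2]) >= 0) {
      result += static_cast<char>(
        hexValue(s[i + 1]) * 16 + hexValue(s[i + 2]));
      i += 2; // Move past the two hex digits
    } else
      result += s[i]; // Ordinary character or bad escape
  }
  return result;
}
```

Checking the index before reading the two digits also prevents the decoder from running off the end of a string that finishes with a lone '%'.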

CGImap parses and holds all the name-value pairs submitted from the form as part of a CGI 
request. You might think that anything that has the word "map" in its name should be
inherited from the STL map, but map has its own way of ordering the elements it stores
whereas here it's useful to keep the elements in the order that they appear on the Web page.
So CGImap is inherited from vector<CGIpair>, and operator[ ] is overloaded so you get
the associative-array lookup of a map.

You can also see that CGImap has a copy-constructor and an operator=, but they're both
declared as private. This is to prevent the compiler from synthesizing the two functions 
(which it will do if you don't declare them yourself), but it also prevents the client 
programmer from passing a CGImap by value or from using assignment. 
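
With a compiler that supports C++11 or later, the same intent can be stated directly with deleted functions. This is a sketch of that alternative idiom, not what CGImap.h does; QueryHolder is a hypothetical class used only for illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical class, used only to illustrate the idiom.
class QueryHolder {
  std::string query;
public:
  explicit QueryHolder(const std::string& q) : query(q) {}
  // Deleted instead of declared private and left undefined:
  QueryHolder(const QueryHolder&) = delete;
  QueryHolder& operator=(const QueryHolder&) = delete;
  const std::string& get() const { return query; }
};
```

An attempt to copy or assign a QueryHolder is then rejected at compile time with a message that names the deleted function, rather than an access error.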

CGImap's job is to take the input data and parse it into name-value pairs, which it will do
with the aid of CGIpair (effectively, CGIpair is only a helper class, but it also seems to
make it easier to understand the code). After copying the query string (you'll see where the
query string comes from later) into the member string object gq, the nextPair( ) member
function is used to parse the string into raw name-value pairs, delimited by '=' and '&' signs.
Each resulting CGIpair object is added to the vector using the standard vector::push_back( ).
When nextPair( ) runs out of input from the query string, it returns an empty CGIpair, which
tests as false and ends the loop.
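
The splitting step can also be written as a free function that does not consume the query string as it goes. The sketch below is an alternative formulation under assumed names (NameValue and splitQuery( ) are hypothetical); it produces the raw, still-encoded pairs and leaves decoding to a separate step.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> NameValue;

// Split "a=1&b=2" into raw (still URL-encoded) pairs,
// preserving the order in which the fields appear.
std::vector<NameValue> splitQuery(const std::string& query) {
  std::vector<NameValue> result;
  std::string::size_type pos = 0;
  while(pos < query.size()) {
    std::string::size_type amp = query.find('&', pos);
    if(amp == std::string::npos)
      amp = query.size(); // Last field has no trailing '&'
    std::string field = query.substr(pos, amp - pos);
    std::string::size_type eq = field.find('=');
    if(eq != std::string::npos) // Skip fields with no '='
      result.push_back(
        NameValue(field.substr(0, eq), field.substr(eq + 1)));
    pos = amp + 1;
  }
  return result;
}
```

Unlike nextPair( ), this version skips a field that has no '=' and keeps going instead of stopping at the first malformed field; which behavior is preferable depends on how strict you want the parser to be.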

The CGImap: :operator[ ] takes the brute-force approach of a linear search through the 
elements. Since the CGImap is intentionally not sorted and they tend to be small, this is not 
too terrible. The dump( ) function is used for testing, typically by sending information to the 
resulting Web page, as you might guess from the default value of nl, which is an HTML 
"break line" token. 

Using GET can be fine for many applications. However, GET passes its data to the CGI
program through an environment variable (called QUERY_STRING), and operating systems
typically run out of environment space with long GET strings (you should start worrying at
about 200 characters). CGI provides a solution for this: POST. With POST, the data is
encoded and concatenated the same way as with GET, but POST uses standard input to pass
the encoded query string to the CGI program and has no length limitation on the input. All
you have to do in your CGI program is determine the length of the query string. This length is
stored in the environment variable CONTENT_LENGTH. Once you know the length, you
can allocate storage and read the precise number of bytes from standard input. Because POST
is the less-fragile solution, you should probably prefer it over GET, unless you know for sure
that your input will be short. In fact, one might surmise that the only reason for GET is that it
is slightly easier to code a CGI program in C using GET. However, the last class in
CGImap.h is a tool that makes handling a POST just as easy as handling a GET, which
means you can always use POST.



The class Post inherits from a string and only has a constructor. The job of the constructor is
to get the query data from the POST into itself (a string). It does this by reading the
CONTENT_LENGTH environment variable using the Standard C library function getenv( ).
This comes back as a pointer to a C character string. If this pointer is zero, the
CONTENT_LENGTH environment variable has not been set, so something is wrong.
Otherwise, the character string must be converted to an integer using the Standard C library
function atoi( ). The resulting length is used with new to allocate enough storage to hold the
query string (plus its null terminator), and then read( ) is called for cin. The read( ) function
takes a pointer to the destination buffer and the number of bytes to read. The resulting buffer
is inserted into the current string using string::append( ). At this point, the POST data is just
a string object and can be easily used without further concern about where it came from.



Testing the CGI parser 



Now that the basic tools are defined, they can easily be used in a CGI program like the
following, which simply dumps the name-value pairs that are parsed from a GET query.
Remember that an iterator for a CGImap returns a CGIpair object when it is dereferenced,
so you must select the first and second parts of that CGIpair:

//: C10:CGI_GET.cpp
// Tests CGImap by extracting the information
// from a CGI GET submitted by an HTML Web page.
#include "CGImap.h"
#include <cstdlib>

int main() {
  // You MUST print this out, otherwise the
  // server will not send the response:
  cout << "Content-type: text/plain\n" << endl;
  // For a CGI "GET," the server puts the data
  // in the environment variable QUERY_STRING:
  CGImap query(getenv("QUERY_STRING"));
  // Test: dump all names and values
  for(CGImap::iterator it = query.begin();
    it != query.end(); it++) {
    cout << (*it).first << " = "
      << (*it).second << endl;
  }
} ///:~

When you use the GET approach (which is controlled by the HTML page with the METHOD
attribute of the FORM tag), the Web server grabs everything after the '?' and puts it into the
operating-system environment variable QUERY_STRING. So to read that information all
you have to do is get the QUERY_STRING. You do this with the standard C library function
getenv( ), passing it the identifier of the environment variable you wish to fetch. In main( ),
notice how simple the act of parsing the QUERY_STRING is: you just hand it to the
constructor for the CGImap object called query and all the work is done for you. Although
an iterator is used here, you can also pull out the names and values from query using
CGImap::operator[ ].

Now it's important to understand something about CGI. A CGI program is handed its input in
one of two ways: through QUERY_STRING during a GET (as in the above case) or through
standard input during a POST. But a CGI program only returns its results through standard
output, via cout. Where does this output go? Back to the Web server, which decides what to
do with it. The server makes this decision based on the content-type header, which means
that if the content-type header isn't the first thing it sees, it won't know what to do with the
data. Thus it's essential that you start the output of all CGI programs with the content-type
header.

In this case, we want the server to feed all the information directly back to the client program. 
The information should be unchanged, so the content-type is text/plain. Once the server sees 
this, it will echo all strings right back to the client as a simple text Web page. 

To test this program, you must compile it in the cgi-bin directory of your host Web server. 
Then you can perform a simple test by writing an HTML page like this: 



//:! C10:GETtest.html
<HTML><HEAD>
<TITLE>A test of standard HTML GET</TITLE>
</HEAD> Test, uses standard html GET
<Form method="GET" ACTION="/cgi-bin/CGI_GET.exe">
<P>Field1: <INPUT TYPE = "text" NAME = "Field1"
VALUE = "This is a test" size = "40"></p>
<P>Field2: <INPUT TYPE = "text" NAME = "Field2"
VALUE = "of the emergency" size = "40"></p>
<P>Field3: <INPUT TYPE = "text" NAME = "Field3"
VALUE = "broadcast system" size = "40"></p>
<P>Field4: <INPUT TYPE = "text" NAME = "Field4"
VALUE = "this is only a test" size = "40"></p>
<P>Field5: <INPUT TYPE = "text" NAME = "Field5"
VALUE = "In a real emergency" size = "40"></p>
<P>Field6: <INPUT TYPE = "text" NAME = "Field6"
VALUE = "you will be instructed" size = "40"></p>
<p><input type = "submit" name = "submit" > </p>
</Form></HTML>
///:~



Of course, the CGI_GET.exe program must be compiled on some kind of Web server and
placed in the correct subdirectory (typically called "cgi-bin") in order for this web page to
work. The dominant Web server is the freely-available Apache (see http://www.Apache.org),
which runs on virtually all platforms. Some word-processing/spreadsheet packages even come
with Web servers. It's also quite cheap and easy to get an old PC and install Linux along with
an inexpensive network card. Linux automatically sets up the Apache server for you, and you
can test everything on your local network as if it were live on the Internet. One way or another
it's possible to install a Web server for local tests, so you don't need to have a remote Web
server and permission to install CGI programs on that server.

One of the advantages of this design is that, now that CGIpair and CGImap are defined,
most of the work is done for you so you can easily create your own CGI program simply by
modifying main( ).



Using POST 



The CGIpair and CGImap from CGImap.h can be used as is for a CGI program that
handles POSTs. The only thing you need to do is get the data from a Post object instead of
from the QUERY_STRING environment variable. The following listing shows how simple it
is to write such a CGI program:

//: C10:CGI_POST.cpp
// CGImap works as easily with POST as it
// does with GET.
#include "CGImap.h"
#include <iostream>
using namespace std;

int main() {
  cout << "Content-type: text/plain\n" << endl;
  Post p; // Get the query string
  CGImap query(p);
  // Test: dump all names and values
  for(CGImap::iterator it = query.begin();
    it != query.end(); it++) {
    cout << (*it).first << " = "
      << (*it).second << endl;
  }
} ///:~

After creating a Post object, the query string is no different from a GET query string, so it is
handed to the constructor for CGImap. The different fields in the vector are then available
just as in the previous example. If you wanted to get even more terse, you could even define
the Post as a temporary directly inside the constructor for the CGImap object. The extra
parentheses are needed because, without them, the compiler parses the statement as the
declaration of a function named query rather than as an object definition:

CGImap query((Post()));
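
A caution about the terse form: a statement shaped like T x(U()); is read by the compiler as a function declaration, not an object definition (this is often called the "most vexing parse"). The sketch below uses hypothetical stand-in types, FakePost and FakeMap, rather than the real Post and CGImap, to show the problem and the usual workaround of an extra pair of parentheses.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the Post and CGImap classes.
struct FakePost : std::string {
  FakePost() : std::string("a=1") {}
};
struct FakeMap {
  std::string text;
  FakeMap(std::string q) : text(q) {}
};

// Parsed as a FUNCTION declaration: 'vexed' takes a pointer
// to a function returning FakePost and returns a FakeMap.
FakeMap vexed(FakePost());

// The extra parentheses force an object definition:
FakeMap query((FakePost()));
```

Many compilers warn about the first form, but they accept it; the mistake usually shows up later as a confusing error when you try to use the "object."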

To test this program, you can use the following Web page: 



//:! C10:POSTtest.html
<HTML><HEAD>
<TITLE>A test of standard HTML POST</TITLE>
</HEAD>Test, uses standard html POST
<Form method="POST" ACTION="/cgi-bin/CGI_POST.exe">
<P>Field1: <INPUT TYPE = "text" NAME = "Field1"
VALUE = "This is a test" size = "40"></p>
<P>Field2: <INPUT TYPE = "text" NAME = "Field2"
VALUE = "of the emergency" size = "40"></p>
<P>Field3: <INPUT TYPE = "text" NAME = "Field3"
VALUE = "broadcast system" size = "40"></p>
<P>Field4: <INPUT TYPE = "text" NAME = "Field4"
VALUE = "this is only a test" size = "40"></p>
<P>Field5: <INPUT TYPE = "text" NAME = "Field5"
VALUE = "In a real emergency" size = "40"></p>
<P>Field6: <INPUT TYPE = "text" NAME = "Field6"
VALUE = "you will be instructed" size = "40"></p>
<p><input type = "submit" name = "submit" > </p>
</Form></HTML>
///:~







When you press the "submit" button, you'll get back a simple text page containing the parsed 
results, so you can see that the CGI program works correctly. The server turns around and 
feeds the query string to the CGI program via standard input. 



Handling mailing lists 



Managing an email list is the kind of problem many people need to solve for their Web site.
The simplest approach is always the best. I learned this the hard way, first trying a variety of
Java applets (which some firewalls do not allow) and even JavaScript (which isn't supported
uniformly on all browsers). The result of each experiment was a steady stream of email from
the folks who couldn't get it to work. When you set up a Web site, your goal should be to
never get email from anyone complaining that it doesn't work, and the best way to produce
this result is to use plain HTML (which, with a little work, can be made to look quite decent).

The second problem was on the server side. Ideally, you'd like all your email addresses to be
added and removed from a single master file, but this presents a problem. Most operating
systems allow more than one program to open a file. When a client makes a CGI request, the
Web server starts up a new invocation of the CGI program, and since a Web server can handle
many requests at a time, this means that you can have many instances of your CGI program
running at once. If the CGI program opens a specific file, then you can have many programs
running at once that open that file. This is a problem if they are each reading and writing to
that file.
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There may be a function for your operating system that "locks'" a file, so that other 
invocations of your program do not access the file at the same time. However, I took a 
different approach, which was to make a unique file for each client. Making a file unique is
quite easy, since the email name itself is a unique character string. The filename for each
request is then just the email name, followed by the string ".add" or ".remove". The file's
contents are also the email address of the client. Then, to produce a list of all the names to
add, you simply say something like (in Unix):

cat *.add > addlist

(or the equivalent for your system). For removals, you say: 



Once the names have been combined into a list you can archive or remove the files. 
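
The naming scheme is simple enough to capture in one function. This is a sketch only: requestFileName( ) is a hypothetical helper, and returning an empty string for an unrecognized command is an assumed error convention.

```cpp
#include <string>

// Build the per-request file name: the email address plus
// ".add" or ".remove". Returns an empty string for any
// other command (assumed error convention).
std::string requestFileName(const std::string& email,
                            const std::string& command) {
  if(command == "add" || command == "remove")
    return email + "." + command;
  return std::string();
}
```

Because the address comes straight from the client, a production version would also want to reject characters such as '/' that could move the file outside the intended directory.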

The HTML code to place on your Web page becomes fairly straightforward. This particular 
example takes an email address to be added or removed from my C++ mailing list: 



<h1 align="center"><font color="#000000">
The C++ Mailing List</font></h1>

<table border="1" cellpadding="4"
cellspacing="1" width="550" bgcolor="#FFFFFF">
<tr>
<td width="30" bgcolor="#FF0000">&nbsp;</td>
<td align="center" width="422" bgcolor="#0">
<form action="/cgi-bin/mlm.exe" method="GET">
<input type="hidden" name="subject-field"
value="cplusplus-email-list">
<input type="hidden" name="command-field"
value="add"><p>
<input type="text" size="40"
name="email-address">
<input type="submit" name="submit"
value="Add Address to C++ Mailing List">
</p></form></td>
<td width="30" bgcolor="#FF0000">&nbsp;</td>
</tr>
<tr>
<td width="30" bgcolor="#000000">&nbsp;</td>
<td align="center" width="422"
bgcolor="#FF0000">
<form action="/cgi-bin/mlm.exe" method="GET">
<input type="hidden" name="subject-field"
value="cplusplus-email-list">
<input type="hidden" name="command-field"
value="remove"><p>
<input type="text" size="40"
name="email-address">
<input type="submit" name="submit"
value="Remove Address From C++ Mailing List">
</p></form></td>
<td width="30" bgcolor="#000000">&nbsp;</td>
</tr>
</table>

Each form contains one data-entry field called email-address, as well as a couple of hidden
fields which don't provide for user input but carry information back to the server nonetheless.
The subject-field tells the CGI program the subdirectory where the resulting file should be
placed. The command-field tells the CGI program whether the user is requesting that they be
added or removed from the list. From the action, you can see that a GET is used with a
program called mlm.exe (for "mailing list manager"). Here it is:

//: C10:mlm.cpp
// A CGI program to maintain a mailing list
#include "CGImap.h"
#include <cstdlib>
#include <fstream>
using namespace std;
const string contact("Bruce@EckelObjects.com");
// Paths in this program are for Linux/Unix. You
// must use backslashes (two for each single
// slash) on Win32 servers:
const string rootpath("/home/eckel/");

int main() {
  cout << "Content-type: text/html\n" << endl;
  CGImap query(getenv("QUERY_STRING"));
  if(query["test-field"] == "on") {
    cout << "map size: " << query.size() << "<br>";
    query.dump(cout, "<br>");
  }
  if(query["subject-field"].size() == 0) {
    cout << "<h2>Incorrect form. Contact "
      << contact << endl;
    return 0;
  }
  string email = query["email-address"];
  if(email.size() == 0) {
    cout << "<h2>Please enter your email address"
      << endl;
    return 0;
  }
  if(email.find_first_of(" \t") != string::npos) {
    cout << "<h2>You cannot use white space "
      "in your email address" << endl;
    return 0;
  }
  if(email.find('@') == string::npos) {
    cout << "<h2>You must use a proper email"
      " address including an '@' sign" << endl;
    return 0;
  }
  if(email.find('.') == string::npos) {
    cout << "<h2>You must use a proper email"
      " address including a '.'" << endl;
    return 0;
  }
  string fname = email;
  if(query["command-field"] == "add")
    fname += ".add";
  else if(query["command-field"] == "remove")
    fname += ".remove";
  else {
    cout << "command-field not found. Contact "
      << contact << endl;
    return 0;
  }
  string path(rootpath + query["subject-field"]
    + "/" + fname);
  ofstream out(path.c_str());
  if(!out) {
    cout << "cannot open " << path << "; Contact "
      << contact << endl;
    return 0;
  }
  out << email << endl;
  cout << "<br><H2>" << email << " has been ";
  if(query["command-field"] == "add")
    cout << "added";
  else if(query["command-field"] == "remove")
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cout « "removed"; 
  cout << "<br>Thank you</H2>" << endl;
} ///:~

Again, all the CGI work is done by the CGImap. From then on it's a matter of pulling the
fields out and looking at them, then deciding what to do about it, which is easy because of the
way you can index into a map and also because of the tools available for standard strings.
Here, most of the programming has to do with checking for a valid email address. Then a file
name is created with the email address as the name and ".add" or ".remove" as the extension,
and the email address is placed in the file.
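
The address checks can be pulled out into a function of their own, which makes them easier to test apart from the CGI plumbing. In this sketch, checkEmail( ) is a hypothetical helper and the wording of the messages is an assumption; the checks themselves (not empty, no white space, contains '@' and '.') are the ones the program performs.

```cpp
#include <string>

// Returns an empty string if the address passes the basic
// checks, otherwise a short description of the problem.
std::string checkEmail(const std::string& email) {
  if(email.empty())
    return "empty address";
  if(email.find_first_of(" \t") != std::string::npos)
    return "white space in address";
  if(email.find('@') == std::string::npos)
    return "missing '@'";
  if(email.find('.') == std::string::npos)
    return "missing '.'";
  return std::string();
}
```

These checks only catch obvious typing mistakes; they do not establish that an address is well-formed or deliverable.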

Maintaining your list 

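Once you have a list of names to add, you can just paste them to the end of your list. However,
you might get some duplicates so you need a program to remove those. Because your names
may differ only by upper and lowercase, it's useful to create a tool that will read a list of
names from a file and place them into a container of strings, forcing all the names to
lowercase as it does:

//: C10:readLower.h
// Read a file into a container of string,
// forcing each line to lowercase
#ifndef READLOWER_H
#define READLOWER_H
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <cctype>

inline char downcase(char c) {
  return static_cast<char>(
    std::tolower(static_cast<unsigned char>(c)));
}

inline std::string lcase(std::string s) {
  std::transform(s.begin(), s.end(),
    s.begin(), downcase);
  return s;
}

template<class SContainer>
void readLower(char* filename, SContainer& c) {
  std::ifstream in(filename);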
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const int sz = 1024; 
char buf [sz] ; 

while (in.getline (buf, sz) ) 
// Force to lowercase: 
c.pi]sh_back (string (lease (buf) ) ); 
( 
#endif // READLOWER_H ///:- 

Since it's a template, it will work with any container of string that supports push_back( ).
Again, you may want to change the above to the form getline(in, s) instead of using a
fixed-sized buffer, which is more fragile.
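As a sketch of that suggestion, here is a variant that reads whole lines into a string with std::getline( ) so no fixed-size buffer is needed. The names readLowerLines( ) and downcaseChar( ) are hypothetical, and the function takes an already-open istream (an assumption made so it can be tried on a stringstream as well as a file).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Convert a single character to lowercase (safe for any char value)
inline char downcaseChar(char c) {
  return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Read every line from 'in', force it to lowercase, and append it to
// any container of string that supports push_back()
template<class SContainer>
void readLowerLines(std::istream& in, SContainer& c) {
  std::string line;
  while (std::getline(in, line)) {
    std::transform(line.begin(), line.end(), line.begin(), downcaseChar);
    c.push_back(line);
  }
}
```

Because std::getline( ) grows the string as needed, a line longer than 1024 characters is no longer a problem.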

Once the names are read into the list and forced to lowercase, removing duplicates is trivial: 

//: C10:RemoveDuplicates.cpp
// Remove duplicate names from a mailing list
#include "readLower.h"
#include "../require.h"
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 2);
  vector<string> names;
  readLower(argv[1], names);
  long before = names.size();
  // You must sort first for unique() to work:
  sort(names.begin(), names.end());
  // Remove adjacent duplicates. unique() only
  // moves them to the end, so erase() them:
  names.erase(
    unique(names.begin(), names.end()),
    names.end());
  long removed = before - names.size();
  ofstream out(argv[2]);
  assure(out, argv[2]);
  copy(names.begin(), names.end(),
    ostream_iterator<string>(out, "\n"));
  cout << removed << " names removed" << endl;
} ///:~

A vector is used here instead of a list because sorting requires random access, which is much
faster in a vector. (A list has a built-in sort( ) so that it doesn't suffer from the performance
cost that would result from applying the normal sort( ) algorithm shown above, which expects
random-access iterators.)



The sort must be performed so that all duplicates are adjacent to each other. Then unique( )
can remove all the adjacent duplicates. The program also keeps track of how many duplicate
names were removed.
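One detail is worth spelling out: the unique( ) algorithm does not shrink the container. It moves the surviving elements to the front and returns an iterator to the new logical end, so the leftovers must be erased before size( ) reflects the removal. A minimal sketch, with a hypothetical function name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sort the names, drop adjacent duplicates, and report how many
// elements were removed. (Function name is hypothetical.)
std::size_t removeDuplicates(std::vector<std::string>& names) {
  std::size_t before = names.size();
  std::sort(names.begin(), names.end());
  // unique() returns the new logical end; erase() trims the rest
  names.erase(std::unique(names.begin(), names.end()), names.end());
  return before - names.size();
}
```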

When you have a file of names to remove from your list, readLower( ) comes in handy again:

//: C10:RemoveGroup.cpp
// Remove a group of names from a list
#include "readLower.h"
#include "../require.h"
#include <list>
#include <algorithm>
#include <iterator>
using namespace std;

typedef list<string> Container;

int main(int argc, char* argv[]) {
  requireArgs(argc, 3);
  Container names, removals;
  readLower(argv[1], names);
  readLower(argv[2], removals);
  long original = names.size();
  Container::iterator rmit = removals.begin();
  while(rmit != removals.end())
    names.remove(*rmit++); // Removes all matches
  ofstream out(argv[3]);
  assure(out, argv[3]);
  copy(names.begin(), names.end(),
    ostream_iterator<string>(out, "\n"));
  long removed = original - names.size();
  cout << "On removal list: " << removals.size()
    << "\n Removed: " << removed << endl;
} ///:~

Here, a list is used instead of a vector (since readLower( ) is a template, it adapts). Although
there is a remove( ) algorithm that can be applied to containers, the built-in list::remove( )
seems to work better. The second command-line argument is the file containing the list of
names to be removed. An iterator is used to step through that list, and the list::remove( )
function removes every instance of each name from the master list. Here, the list doesn't need
to be sorted first.
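The removal loop can be isolated into a small function, sketched below with a hypothetical name. Unlike the generic remove( ) algorithm, which only shuffles elements and leaves the container the same size, list::remove( ) actually erases every matching element.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Remove from 'names' every entry that appears in 'removals' and
// return the number of elements erased. (Name is hypothetical.)
std::size_t removeGroup(std::list<std::string>& names,
                        const std::list<std::string>& removals) {
  std::size_t original = names.size();
  for (std::list<std::string>::const_iterator it = removals.begin();
       it != removals.end(); ++it)
    names.remove(*it);  // Erases all elements equal to *it
  return original - names.size();
}
```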

Unfortunately, that's not all there is to it. The messiest part about maintaining a mailing list is 
the bounced messages. Presumably, you'll just want to remove the addresses that produce 
bounces. If you can combine all the bounced messages into a single file, the following 
program has a pretty good chance of extracting the email addresses; then you can use 
RemoveGroup to delete them from your list.

//: C10:ExtractUndeliverable.cpp
// Find undeliverable names to remove from
// mailing list from within a mail file
// containing many messages
#include "../require.h"
#include <cstdio>
#include <cstring>
#include <string>
#include <set>
using namespace std;

const char* start_str[] = {
  "following address",
  "following recipient",
  "following destination",
  "undeliverable to the following",
  "following invalid",
};

const char* continue_str[] = {
  "Message-ID",
  "Please reply to",
};

// The in() function allows you to check whether
// a string in this set is part of your argument.
class StringSet {
  const char** ss;
  int sz;
public:
  StringSet(const char** sa, int sza)
    : ss(sa), sz(sza) {}
  bool in(const char* s) {
    for(int i = 0; i < sz; i++)
      if(strstr(s, ss[i]) != 0)
        return true;
    return false;
  }
};

// Calculate array length:
#define ALEN(A) ((sizeof A)/(sizeof *A))

StringSet
  starts(start_str, ALEN(start_str)),
  continues(continue_str, ALEN(continue_str));

int main(int argc, char* argv[]) {
  requireArgs(argc, 2,
    "Usage:ExtractUndeliverable infile outfile");
  FILE* infile = fopen(argv[1], "rb");
  FILE* outfile = fopen(argv[2], "w");
  require(infile != 0); require(outfile != 0);
  set<string> names;
  const int sz = 1024;
  char buf[sz];
  while(fgets(buf, sz, infile) != 0) {
    if(starts.in(buf)) {
      puts(buf);
      while(fgets(buf, sz, infile) != 0) {
        if(continues.in(buf)) continue;
        // A line of dashes ends the addresses:
        if(strstr(buf, "---") != 0) break;
        const char* delimiters = " \t<>():;,\n\"";
        char* name = strtok(buf, delimiters);
        while(name != 0) {
          if(strstr(name, "@") != 0)
            names.insert(string(name));
          name = strtok(0, delimiters);
        }
      }
    }
  }
  set<string>::iterator i = names.begin();
  while(i != names.end())
    fprintf(outfile, "%s\n", (*i++).c_str());
} ///:~

The first thing you'll notice about this program is that it contains some C functions, including C
I/O. This is not because of any particular design insight. It just seemed to work when I used
the C elements, and it started behaving strangely with C++ I/O. So the C is just because it
works, and you may be able to rewrite the program in more "pure C++" using your C++
compiler and produce correct results.

A lot of what this program does is read lines looking for string matches. To make this
convenient, I created a StringSet class with a member function in( ) that tells you whether
any of the strings in the set are in the argument. The StringSet is initialized with a constant
array of strings and the size of that array. Besides making the code easier to read, the
StringSet makes it easy to add new strings to the arrays.



Both the input file and the output file in main( ) are manipulated with standard I/O, since it's
not a good idea to mix I/O types in a program. Each line is read using fgets( ), and if one of
them matches with the starts StringSet, then what follows will contain email addresses, until
you see some dashes (I figured this out empirically, by hunting through a file full of bounced
email). The continues StringSet contains strings whose lines should be ignored. For each of
the lines that potentially contains an address, each address is extracted using the Standard C
Library function strtok( ) and then it is added to the set<string> called names. Using a set
eliminates duplicates (you may have duplicates based on case, but those are dealt with by
RemoveGroup.cpp). The resulting set of names is then printed to the output file.
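For comparison, the tokenizing step can also be written with std::string members instead of strtok( ). This is only a sketch under the assumption that the same delimiter characters apply; the function name collectAddresses( ) is hypothetical. As in the program, any token containing an '@' is treated as an address and the set discards exact duplicates.

```cpp
#include <cassert>
#include <set>
#include <string>

// Split 'line' on the delimiter characters and insert every token
// that contains an '@' into 'names'. (Function name is hypothetical.)
void collectAddresses(const std::string& line,
                      std::set<std::string>& names) {
  const std::string::size_type npos = std::string::npos;
  const std::string delimiters(" \t<>():;,\n\"");
  std::string::size_type start = line.find_first_not_of(delimiters);
  while (start != npos) {
    std::string::size_type stop = line.find_first_of(delimiters, start);
    std::string token = (stop == npos) ? line.substr(start)
                                       : line.substr(start, stop - start);
    if (token.find('@') != npos)
      names.insert(token);
    start = (stop == npos) ? npos
                           : line.find_first_not_of(delimiters, stop);
  }
}
```

Unlike strtok( ), this version does not modify its input, so it can be applied to a const string.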

Mailing to your list 

The following program just takes the simple approach of calling an external command
("fastmail," which is part of Unix) using the Standard C library function system( ). The
program spends all its time building the external command.



When people don't want to be on a list anymore they will often ignore instructions and just 
reply to the message. This can be a problem if the email address they're replying with is 
different than the one that's on your list (sometimes it has been routed to a new or aliased 
address). To solve the problem, this program prepends the text file with a message that 
informs them that they can remove themselves from the list by visiting a URL. Since many 
email programs will present a URL in a form that allows you to just click on it, this can 
produce a very simple removal process. If you look at the URL, you can see it's a call to the 
mlm.exe CGI program, including removal information that incorporates the same email 
address the message was sent to. That way, even if the user just replies to the message, all you 
have to do is click on the URL that comes back with their reply (assuming the message is 
automatically copied back to you). 

//: C10:Batchmail.cpp
// Sends mail to a list using Unix fastmail
#include "../require.h"
#include <iostream>
#include <fstream>
#include <string>
#include <strstream>
#include <cstdlib> // system() function
using namespace std;

string subject("..."); // Subject line for the mailing
string from("Bruce@EckelObjects.com");
string replyto("Bruce@EckelObjects.com");
ofstream logfile("BatchMail.log");

int main(int argc, char* argv[]) {
  requireArgs(argc, 2,
    "Usage: Batchmail namelist mailfile");
  ifstream names(argv[1]);
  assure(names, argv[1]);
  string name;
  while(getline(names, name)) {
    ofstream msg("m.txt");
    assure(msg, "m.txt");
    msg << "To be removed from this list, "
      "DO NOT REPLY TO THIS MESSAGE. Instead, \n"
      "click on the following URL, or visit it "
      "using your Web browser. This \n"
      "way, the proper email address will be "
      "removed. Here's the URL:\n"
      << "http://www.mindview.net/cgi-bin/"
      "mlm.exe?subject-field=workshop-email-list"
      "&command-field=remove&email-address="
      << name << "&submit=submit\n\n";
    ifstream text(argv[2]);
    assure(text, argv[2]);
    msg << text.rdbuf() << endl;
    msg.close(); // Flush before fastmail reads it
    string command("fastmail -F " + from +
      " -r " + replyto + " -s \"" + subject +
      "\" m.txt " + name);
    system(command.c_str());
    logfile << command << endl;
    static int mailcounter = 0;
    ++mailcounter; // Count each message once
    const int bsz = 25;
    char buf[bsz];
    // Convert mailcounter to a char string:
    ostrstream mcounter(buf, bsz);
    mcounter << mailcounter << ends;
    if((mailcounter % 500) == 0) {
      string command2("fastmail -F " + from +
        " -r " + replyto + " -s \"Sent " +
        string(buf) +
        " messages \" m.txt eckel@aol.com");
      system(command2.c_str());
    }
  }
} ///:~

The first command-line argument is the list of email addresses, one per line. The names are
read one at a time into the string called name using getline( ). Then a temporary file called
m.txt is created to build the customized message for that individual; the customization is the
note about how to remove themselves, along with the URL. Then the message body, which is
in the file specified by the second command-line argument, is appended to m.txt. Finally, the
command is built inside a string: the "-F" argument to fastmail is who it's from, the "-r"
argument is who to reply to. The "-s" is the subject line, the next argument is the file
containing the mail and the last argument is the email address to send it to.
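To make the argument order easier to see, the command construction can be pulled out into a function. This is a sketch only: fastmailCommand( ) is a hypothetical helper, and the flags are the ones described above (-F, -r, -s, then the file and the recipient). Note that the recipient comes straight from a file and is handed to the shell unquoted, so this approach assumes the name list contains nothing but plain addresses.

```cpp
#include <cassert>
#include <string>

// Build the command line in the order described in the text:
// -F sender, -r reply-to, -s "subject", message file, recipient.
// (Helper name is hypothetical.)
std::string fastmailCommand(const std::string& from,
                            const std::string& replyto,
                            const std::string& subject,
                            const std::string& file,
                            const std::string& to) {
  return "fastmail -F " + from + " -r " + replyto +
         " -s \"" + subject + "\" " + file + " " + to;
}
```

The resulting string would then be passed to system( ) via c_str( ).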

You can start this program in the background and tell Unix not to stop the program when you 
sign off of the server. However, it takes a while to run for a long list (this isn't because of the 
program itself, but the mailing process). I like to keep track of the progress of the program by 
sending a status message to another email account, which is accomplished in the last few lines 
of the program. 

A general information-extraction 
CGI program 

One of the problems with CGI is that you must write and compile a new program every time
you want to add a new facility to your Web site. However, much of the time all that your CGI
program does is capture information from the user and store it on the server. If you could use
hidden fields to specify what to do with the information, then it would be possible to write a
single CGI program that would extract the information from any CGI request. This
information could be stored in a uniform format, in a subdirectory specified by a hidden field
in the HTML form, and in a file that included the user's email address. Of course, the email
address alone doesn't guarantee a unique name (the user may post more than one
submission), so the date and time of the submission can be mangled in with the file name to
make it unique. With this approach you can create a new data-collection page just by
defining the HTML and creating a new subdirectory on your server. For example, every time I
come up with a new class or workshop, all I have to do is create the HTML form for signups;
no CGI programming is required.

The following HTML page shows the format for this scheme. Since a CGI POST is more
general and doesn't have any limit on the amount of information it can send, it will always be
used instead of a GET for the ExtractInfo.cpp program that will implement this system.
Although this form is simple, yours can be as complicated as you need it.

//:! C10:INFOtest.html
<html><head><title>
Extracting information from an HTML POST</title>
</head>
<body bgcolor="#FFFFFF" link="#0000FF"
vlink="#800080"> <hr>
<p>Extracting information from an HTML POST</p>
<form action="/cgi-bin/ExtractInfo.exe"
method="POST">
<input type="hidden" name="subject-field"
value="test-extract-info">
<input type="hidden" name="reminder"
value="Remember your lunch!">
<input type="hidden" name="test-field"
value="on">
<input type="hidden" name="mail-copy"
value="Bruce@EckelObjects.com;eckel@aol.com">
<input type="hidden" name="confirmation"
value="confirmation1">
<p>Email address (Required): <input
type="text" name="email-address">
</p>Comment:<br>
<textarea name="Comment"></textarea>
<p><input type="submit" name="submit">
<input type="reset" name="reset"></p>
</form><hr></body></html>
///:~





Right after the form's action statement, you see 

<input type="hidden" 

This means that particular field will not appear on the form that the user sees, but the 
information will still be submitted as part of the data for the CGI program. 

The value of this field named "subject-field" is used by ExtractInfo.cpp to determine the
subdirectory in which to place the resulting file (in this case, the subdirectory will be
"test-extract-info"). Because of this technique and the generality of the program, the only thing
you'll usually need to do to start a new database of data is to create the subdirectory on the
server and then create an HTML page like the one above. The ExtractInfo.cpp program will
do the rest for you by creating a unique file for each submission. Of course, you can always
change the program if you want it to do something more unusual, but the system as shown
will work most of the time.

The contents of the "reminder" field will be displayed on the form that is sent back to the user 
when their data is accepted. The "test-field" indicates whether to dump test information to the 
resulting Web page. If "mail-copy" exists and contains anything other than "no" the value 
string will be parsed for mailing addresses separated by ';' and each of these addresses will 
get a mail message with the data in it. The "email-address" field is required in each case and 
the email address will be checked to ensure that it conforms to some basic standards. 
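Splitting the "mail-copy" value on ';' is a small job for string::find( ) and substr( ). The following sketch uses a hypothetical function name; it keeps the search position in a string::size_type, which is the type that can safely be compared against string::npos.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Break a ';'-separated field into individual recipient addresses.
// (Function name is hypothetical.)
std::vector<std::string> splitRecipients(const std::string& field) {
  std::vector<std::string> recipients;
  std::string::size_type start = 0;
  std::string::size_type semi = field.find(';');
  while (semi != std::string::npos) {
    recipients.push_back(field.substr(start, semi - start));
    start = semi + 1;
    semi = field.find(';', start);
  }
  recipients.push_back(field.substr(start));  // Last one
  return recipients;
}
```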

The "confirmation" field causes a second program to be executed when the form is posted.
This program parses the information that was stored from the form into a file, turns it into
human-readable form and sends an email message back to the client to confirm that their
information was received (this is useful because the user may not have entered their email
address correctly; if they don't get a confirmation message they'll know something is wrong).
The design of the "confirmation" field allows the person creating the HTML page to select
more than one type of confirmation. Your first solution to this may be to simply call the
program directly rather than indirectly as was done here, but you don't want to allow someone
else to choose - by modifying the web page that's downloaded to them - what programs they
can run on your machine.
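One way to picture this indirection is a lookup table that maps the permitted "confirmation" values to the programs they stand for, so that anything not in the table is simply ignored. The sketch below is an assumption about how such a table could look, not the book's code; the function name is hypothetical, while the "confirmation1" value and the ProcessApplication.exe program are the ones used in this chapter.

```cpp
#include <cassert>
#include <map>
#include <string>

// Return the command to run for a given confirmation type, or an
// empty string if the type is not on the approved list.
// (Function name is hypothetical.)
std::string confirmationCommand(const std::string& conftype,
                                const std::string& path) {
  std::map<std::string, std::string> programs;
  programs["confirmation1"] = "./ProcessApplication.exe";
  std::map<std::string, std::string>::const_iterator it =
    programs.find(conftype);
  if (it == programs.end())
    return std::string();  // Unknown type: run nothing
  // Trailing ampersand runs the program as a separate process
  return it->second + " " + path + " &";
}
```

Because the form only supplies a key, a modified web page can at worst select a different approved program or none at all.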

Here is the program that will extract the information from the CGI request: 

//: C10:ExtractInfo.cpp
// Extracts all the information from a CGI POST
// submission, generates a file and stores the
// information on the server. By generating a
// unique file name, there are no clashes like
// you get when storing to a single file.
#include "CGImap.h"
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdio>
#include <cstdlib>
#include <ctime>
using namespace std;

const string contact("Bruce@EckelObjects.com");
// Paths in this program are for Linux/Unix. You
// must use backslashes (two for each single
// slash) on Win32 servers:
const string rootpath("/home/eckel/");

void show(CGImap& m, ostream& o);
// The definition for the following is the only
// thing you must change to customize the program
void
store(CGImap& m, ostream& o, string nl = "\n");

int main() {
  cout << "Content-type: text/html\n" << endl;
  Post p; // Collect the POST data
  CGImap query(p);
  // "test-field" set to "on" will dump contents
  if(query["test-field"] == "on") {
    cout << "map size: " << query.size() << "<br>";
    query.dump(cout);
  }
  if(query["subject-field"].size() == 0) {
    cout << "<h2>Incorrect form. Contact " <<
      contact << endl;
    return 0;
  }
  string email = query["email-address"];
  if(email.size() == 0) {
    cout << "<h2>Please enter your email address"
      << endl;
    return 0;
  }
  if(email.find_first_of(" \t") != string::npos) {
    cout << "<h2>You cannot include white space "
      "in your email address" << endl;
    return 0;
  }
  if(email.find('@') == string::npos) {
    cout << "<h2>You must include a proper email"
      " address including an '@' sign" << endl;
    return 0;
  }
  if(email.find('.') == string::npos) {
    cout << "<h2>You must include a proper email"
      " address including a '.'" << endl;
    return 0;
  }
  // Create a unique file name with the user's
  // email address and the current time in hex
  const int bsz = 1024;
  char fname[bsz];
  time_t now;
  time(&now); // Encoded date & time
  sprintf(fname, "%s%lX.txt", email.c_str(),
    static_cast<unsigned long>(now));
  string path(rootpath + query["subject-field"] +
    "/" + fname);
  ofstream out(path.c_str());
  if(!out) {
    cout << "cannot open " << path << "; Contact"
      << contact << endl;
    return 0;
  }
  // Store the file and path info
  out << "///{" << path << endl;
  // Display optional reminder:
  if(query["reminder"].size() != 0)
    cout << "<H1>" << query["reminder"] << "</H1>";
  show(query, cout); // For results page
  store(query, out); // Stash data in file
  cout << "<br><H2>Your submission has been "
    "posted as<br>" << fname << endl
    << "<br>Thank you</H2>" << endl;
  out.close();
  // Optionally send generated file as email
  // to recipients specified in the field:
  if(query["mail-copy"].length() != 0 &&
    query["mail-copy"] != "no") {
    string to = query["mail-copy"];
    // Parse out the recipient names, separated
    // by ';', into a vector.
    vector<string> recipients;
    string::size_type ii = to.find(';');
    while(ii != string::npos) {
      recipients.push_back(to.substr(0, ii));
      to = to.substr(ii + 1);
      ii = to.find(';');
    }
    recipients.push_back(to); // Last one
    // "fastmail" only available on Linux/Unix:
    for(size_t i = 0; i < recipients.size(); i++) {
      string cmd("fastmail -s \"" +
        query["subject-field"] + "\" " +
        path + " " + recipients[i]);
      system(cmd.c_str());
    }
  }
  // Execute a confirmation program on the file.
  // Typically, this is so you can email a
  // processed data file to the client along with
  // a confirmation message:
  if(query["confirmation"].length() != 0) {
    string conftype = query["confirmation"];
    if(conftype == "confirmation1") {
      string command("./ProcessApplication.exe " +
        path + " &");
      // The data file is the argument, and the
      // ampersand runs it as a separate process:
      system(command.c_str());
      string logfile("Extract.log");
      ofstream log(logfile.c_str());
    }
  }
}

// For displaying the information on the html
// results page:
void show(CGImap& m, ostream& o) {
  string nl("<br>");
  o << "<h2>The data you entered was:"
    << "</h2><br>"
    << "From[" << m["email-address"] << ']' << nl;
  for(CGImap::iterator it = m.begin();
    it != m.end(); it++) {
    string name = (*it).first,
      value = (*it).second;
    if(name != "email-address" &&
       name != "confirmation" &&
       name != "submit" &&
       name != "mail-copy" &&
       name != "test-field" &&
       name != "reminder")
      // Output layout here is a reconstruction:
      o << name << ": " << value << nl;
  }
}

// Change this to customize the program:
void store(CGImap& m, ostream& o, string nl) {
  o << "From[" << m["email-address"] << ']' << nl;
  for(CGImap::iterator it = m.begin();
    it != m.end(); it++) {
    string name = (*it).first,
      value = (*it).second;
    if(name != "email-address" &&
       name != "confirmation" &&
       name != "submit" &&
       name != "mail-copy" &&
       name != "test-field" &&
       name != "reminder")
      // Delimiters aid parsing of the file:
      o << nl << "[{[" << name << "]}]" << nl
        << "[([" << nl << value << nl << "])]"
        << nl;
  }
} ///:~

The program is designed to be as generic as possible, but if you want to change something it 
is most likely the way that the data is stored in a file (for example, you may want to store it in 
a comma-separated ASCII format so that you can easily read it into a spreadsheet). You can 
make changes to the storage format by modifying store( ), and to the way the data is 
displayed by modifying show( ). 

main( ) begins using the same three lines you'll start with for any POST program. The rest of
the program is similar to mlm.cpp because it looks at the "test-field" and "email-address"
(checking it for correctness). The file name combines the user's email address and the current
date and time in hex - notice that sprintf( ) is used because it has a convenient way to convert
a value to a hex representation. The entire file and path information is stored in the file, along
with all the data from the form, which is tagged as it is stored so that it's easy to parse (you'll
see a program to parse the files a bit later). All the information is also sent back to the user as
a simply-formatted HTML page, along with the reminder, if there is one. If "mail-copy" exists
and is not "no," then the names in the "mail-copy" value are parsed and an email is sent to
each one containing the tagged data. Finally, if there is a "confirmation" field, the value
selects the type of confirmation (there's only one type implemented here, but you can easily
add others) and the command is built that passes the generated data file to the program (called
ProcessApplication.exe). That program will be created in the next section.
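The file-name scheme can also be expressed with iostreams, which avoids the fixed-size buffer that sprintf( ) needs. This is a sketch with a hypothetical function name; it assumes the timestamp fits in an unsigned long and produces the same shape of name as the sample data file shown in this chapter (email address, time in uppercase hex, then ".txt").

```cpp
#include <cassert>
#include <ctime>
#include <sstream>
#include <string>

// Combine the email address with the submission time in hex to
// form a file name. (Function name is hypothetical.)
std::string uniqueFileName(const std::string& email, std::time_t when) {
  std::ostringstream os;
  os << email << std::hex << std::uppercase
     << static_cast<unsigned long>(when) << ".txt";
  return os.str();
}
```

Two submissions from the same address within the same second would still collide, which is also true of the sprintf( ) version.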

Parsing the data files 

You now have a lot of data files accumulating on your Web site, as people sign up for
whatever you're offering. Here's what one of them might look like:

//:! C10:TestData.txt
///{/home/eckel/super-cplusplus-workshop-registration/Bruce@EckelObjects.com35B589A0.txt
From[Bruce@EckelObjects.com]

[{[subject-field]}]
[([
super-cplusplus-workshop-registration
])]

...

[([
Mill Valley
])]

...

[([
415-555-1212
])]
///:~

This is a brief example, but there are as many fields as you have on your HTML form. Now,
if your event is compelling you'll have a whole lot of these files and what you'd like to do is
automatically extract the information from them and put that data in any format you'd like.
For example, the ProcessApplication.exe program mentioned above will use the data in an
email confirmation message. You'll also probably want to put the data in a form that can be
easily brought into a spreadsheet. So it makes sense to start by creating a general-purpose tool
that will automatically parse any file that is created by ExtractInfo.cpp:

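//: C10:FormData.h
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
#include <utility>
using namespace std;

class DataPair : public pair<string, string> {
public:
  DataPair() {}
  DataPair(istream& in) { get(in); }
  DataPair& get(istream& in);
  operator bool() {
    return first.length() != 0;
  }
};

class FormData : public vector<DataPair> {
public:
  string filePath, email;
  // Parse the data from a file:
  FormData(const char* fileName);
  void dump(ostream& os = cout);
  string operator[](const string& key);
};
///:~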

The DataPair class looks a bit like the CGIpair class, but it's simpler. When you create a
DataPair, the constructor calls get( ) to extract the next pair from the input stream. The
operator bool indicates an empty DataPair, which usually signals the end of an input stream.

FormData contains the path where the original file was placed (this path information is
stored within the file), the email address of the user, and a vector<DataPair> to hold the
information. The operator[ ] allows you to perform a map-like lookup, just as in CGImap.

Here are the definitions: 

//: C10:FormData.cpp {O}
#include "FormData.h"
#include "../require.h"
#include <cstring>

DataPair& DataPair::get(istream& in) {
  first.erase(); second.erase();
  string ln;
  getline(in, ln);
  while(ln.find("[{[") == string::npos)
    if(!getline(in, ln)) return *this; // End
  first = ln.substr(3, ln.find("]}]") - 3);
  getline(in, ln); // Throw away [([
  while(getline(in, ln))
    if(ln.find("])]") == string::npos)
      second += ln + string(" ");
    else
      return *this;
  return *this; // Input ended before "])]"
}

FormData::FormData(const char* fileName) {
  ifstream in(fileName);
  assure(in, fileName);
  require(!getline(in, filePath).fail());
  // Should be start of first line:
  require(filePath.find("///{") == 0);
  filePath = filePath.substr(strlen("///{"));
  require(!getline(in, email).fail());
  // Should be start of 2nd line:
  require(email.find("From[") == 0);
  string::size_type begin = strlen("From[");
  string::size_type end = email.find("]");
  string::size_type length = end - begin;
  email = email.substr(begin, length);
  // Get the rest of the data:
  DataPair dp(in);
  while(dp) {
    push_back(dp);
    dp.get(in);
  }
}

string FormData::operator[](const string& key) {
  iterator i = begin();
  while(i != end()) {
    if((*i).first == key)
      return (*i).second;
    i++;
  }
  return string(); // Empty string == not found
}

void FormData::dump(ostream& os) {
  os << "filePath = " << filePath << endl;
  os << "email = " << email << endl;
  for(iterator i = begin(); i != end(); i++)
    os << (*i).first << " = "
       << (*i).second << endl;
} ///:~

The DataPair::get( ) function assumes you are using the same DataPair over and over
(which is the case, in FormData::FormData( )) so it first calls erase( ) for its first and
second strings. Then it begins parsing the lines for the key (which is on a single line and is
denoted by the "[{[" and "]}]") and the value (which may be on multiple lines and is denoted
by a begin-marker of "[([" and an end-marker of "])]") which it places in the first and second
members, respectively.

The FormData constructor is given a file name to open and read. The FormData object 
always expects there to be a file path and an email address, so it reads those itself before 
getting the rest of the data as DataPairs. 
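To see the tagged format being consumed, here is a stand-alone sketch of the same parsing idea written as a free function that returns a pair. The name readPair( ) is hypothetical and this is not the DataPair class itself; it follows the markers described above and, like get( ), joins multi-line values with a trailing space after each line.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Read the next key/value record from 'in'. The key sits on one line
// between "[{[" and "]}]"; the value is every line between a "[(["
// line and a "])]" line. An empty key signals the end of the input.
std::pair<std::string, std::string> readPair(std::istream& in) {
  const std::string::size_type npos = std::string::npos;
  std::pair<std::string, std::string> result;
  std::string ln;
  std::string::size_type open = npos;
  while (std::getline(in, ln)) {       // Skip ahead to a key line
    open = ln.find("[{[");
    if (open != npos) break;
  }
  if (open == npos) return result;     // No more records
  std::string::size_type close = ln.find("]}]", open);
  if (close == npos) return result;    // Malformed key line
  result.first = ln.substr(open + 3, close - (open + 3));
  std::getline(in, ln);                // Throw away the "[([" line
  while (std::getline(in, ln) && ln.find("])]") == npos)
    result.second += ln + " ";
  return result;
}
```

Calling the function repeatedly until the returned key is empty walks through every record in a file.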

With these tools in hand, extracting the data becomes quite easy: 

//: C10:FormDump.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FormData fd(argv[1]);
  fd.dump();
} ///:~

The only reason that ProcessApplication.cpp is busier is that it is building the email reply.
Other than that, it just relies on FormData:

//: C10:ProcessApplication.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"
#include <cstdio>
#include <cstdlib>
using namespace std;

const string from("Bruce@EckelObjects.com");
const string replyto("Bruce@EckelObjects.com");
const string basepath("/home/eckel");

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FormData fd(argv[1]);
  char tfname[L_tmpnam];
  tmpnam(tfname); // Create a temporary file name
  string tempfile(basepath + tfname + fd.email);
  ofstream reply(tempfile.c_str());
  assure(reply, tempfile.c_str());
  reply << "This message is to verify that you "
    "have been added to the list for the "
    << fd["subject-field"] << ". Your signup "
    "form included the following data; please "
    "ensure it is correct. You will receive "
    "further updates via email. Thanks for your "
    "interest in the class!" << endl;
  FormData::iterator i;
  for(i = fd.begin(); i != fd.end(); i++)
    reply << (*i).first << " = "
      << (*i).second << endl;
  reply.close();
  // "fastmail" only available on Linux/Unix:
  string command("fastmail -F " + from +
    " -r " + replyto + " -s \"" +
    fd["subject-field"] + "\" " +
    tempfile + " " + fd.email);
  system(command.c_str()); // Wait to finish
  remove(tempfile.c_str()); // Erase the file
} ///:~

This program first creates a temporary file to build the email message in. Although it uses the
Standard C library function tmpnam( ) to create a temporary file name, this program takes
the paranoid step of assuming that, since there can be many instances of this program running
at once, it's possible that a temporary name in one instance of the program could collide with
the temporary name in another instance. So to be extra careful, the email address is appended
onto the end of the temporary file name.

The message is built, the DataPairs are added to the end of the message, and once again the
Linux/Unix fastmail command is built to send the information. An interesting note: if, in
Linux/Unix, you add an ampersand (&) to the end of the command before giving it to
system( ), then this command will be spawned as a background process and system( ) will
immediately return (the same effect can be achieved in Win32 with start). Here, no
ampersand is used, so system( ) does not return until the command is finished - which is a
good thing, since the next operation is to delete the temporary file which is used in the
command.

The final operation in this project is to extract the data into an easily-usable form. A
spreadsheet is a useful way to handle this kind of information, so this program will put the
data into a form that's easily readable by a spreadsheet program:

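//: C10:DataToSpreadsheet.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"
#include <string>
using namespace std;

string delimiter("\t");

int main(int argc, char* argv[]) {
  // One output line per file on the command line
  for(int f = 1; f < argc; f++) {
    FormData fd(argv[f]);
    cout << fd.email << delimiter;
    FormData::iterator i;
    for(i = fd.begin(); i != fd.end(); i++)
      if((*i).first != "workshop-suggestions")
        cout << (*i).second << delimiter;
    cout << endl;
  }
} ///:~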

Common data interchange formats use various delimiters to separate fields of information.
Here, a tab is used but you can easily change it to something else. Also note that I have
checked for the "workshop-suggestions" field and specifically excluded that, because it tends
to be too long for the information I want in a spreadsheet. You can make another version of
this program that only extracts the "workshop-suggestions" field.

This program assumes that all the file names are expanded on the command line. Using it
under Linux/Unix is easy since file-name global expansion ("globbing") is handled for you.
So you say:

DataToSpreadsheet *.txt

In Win32 (at a DOS prompt) it's a bit more involved, since you must do the "globbing"
yourself:

For %f in (*.txt) do DataToSpreadsheet %f

This technique is generally useful for writing Win32/DOS o

Summary
Exercises



1. In ExtractInfo.cpp, change store( ) so it stores the data in comma-separated
ASCII format.

2. (This exercise may require a little research and ingenuity, but you'll have a
good idea of how server-side programming works when you're done.) Gain
access to a Web server somehow, even if you do so by installing a Web
server that runs on your local machine (the Apache server is freely available
from http://www.Apache.org and runs on most platforms). Install and test
ExtractInfo.cpp as a CGI program, using INFOtest.html.

3. Create a program called ExtractSuggestions.cpp that is a modification of
DataToSpreadsheet.cpp which will only extract the suggestions along
with the name and email address of the person that made them.

A: Recommended
reading



Thinking in C: Foundations for Java & C++, by Chuck Allison (a MindView, Inc. Seminar
on CD ROM, 1999, available at http://www.MindView.net). A course including lectures and
slides in the foundations of the C Language to prepare you to learn Java or C++. This is not an
exhaustive course in C; only the necessities for moving on to the other languages are
included. An extra section covering features for the C++ programmer is included.
Prerequisite: experience with a high-level programming language, such as Pascal, BASIC,
Fortran, or LISP.

General C++ 

The C++ Programming Language, 3rd edition, by Bjarne Stroustrup (Addison-Wesley
1997). To some degree, the goal of the book that you're currently holding is to allow you to
use Bjarne's book as a reference.



C++ Primer, 3rd Edition, by Stanley Lippman and Josee Lajoie (Addison-Wesley 1998). Not
that much of a primer anymore; it's evolved into a thick book filled with lots of detail, and the
one that I reach for along with Stroustrup's when trying to resolve an issue. Thinking in C++
should provide a basis for understanding the C++ Primer as well as Stroustrup's book.

C & C++ Code Capsules, by Chuck Allison (Prentice-Hall, 1998). Assumes that you aheady 
know C and C++, and covers some of the issues that you may be rusty on, or that you may not 
have gotten right the first time. This book fills in C gaps as well as C++ gaps. 

The C++ ANSI/ISO Standard. This is not free, unfortunately (I certainly didn't get paid for 
my time and effort on the Standards Committee; in fact, it cost me a lot of money). But at 
least you can buy the electronic form in PDF for only $18 at http://www.cssinfo.com. 



Large Scale C++ (?) by John Lakos. 

C++ Gems, Stan Lippman, editor. SIGS public; 

The Design & Evolution of C++, by Bjarne Stroustrup 

My own list of books 



Computer Interfacing with Pascal & C (self-published via the Eisys imprint; only available 
via the Web site) 

Using C++ 

C++ Inside & Out 

Thinking in C++, 1st edition 

Black Belt C++, the Master's Collection (edited by Bruce Eckel) (out of print). 

Thinking in Java, 2nd edition 

Depth & dark corners 

These books go more deeply into topics of the language, and help you avoid the typical 
pitfalls inherent in developing C++ programs. 

Effective C++ and More Effective C++, by Scott Meyers. 

I II iLiiig IS gg [ f I h i; gcgi' 1 U gg. 



The STL 
Design Patterns 
This appendix contains files from Volume 1 that are 
required to build the files in Volume 2. 

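//: :require.h
// Test for error conditions in programs
// Local "using namespace std" for old compilers
#ifndef REQUIRE_H
#define REQUIRE_H
#include <cstdio>
#include <cstdlib>
#include <fstream>

// Print msg and exit if the requirement does not hold
inline void require(bool requirement,
  const char* msg = "Requirement failed") {
  using namespace std;
  if(!requirement) {
    fputs(msg, stderr);
    fputs("\n", stderr);
    exit(1);
  }
}

// Exactly args command-line arguments (argv[0] not counted)
inline void requireArgs(int argc, int args,
  const char* msg = "Must use %d arguments") {
  using namespace std;
  if(argc != args + 1) {
    fprintf(stderr, msg, args);
    fputs("\n", stderr);
    exit(1);
  }
}

// At least minArgs command-line arguments
inline void requireMinArgs(int argc, int minArgs,
  const char* msg =
    "Must use at least %d arguments") {
  using namespace std;
  if(argc < minArgs + 1) {
    fprintf(stderr, msg, minArgs);
    fputs("\n", stderr);
    exit(1);
  }
}

// Exit if an input file failed to open
inline void assure(std::ifstream& in,
  const char* filename = "") {
  using namespace std;
  if(!in) {
    fprintf(stderr,
      "Could not open file %s\n", filename);
    exit(1);
  }
}

// Exit if an output file failed to open
inline void assure(std::ofstream& in,
  const char* filename = "") {
  using namespace std;
  if(!in) {
    fprintf(stderr,
      "Could not open file %s\n", filename);
    exit(1);
  }
}
#endif // REQUIRE_H ///:~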
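
The `+ 1` in the argument-count checks above (`argc != args + 1` and `argc < minArgs + 1`) is there because `argc` also counts the program name in `argv[0]`. As a minimal sketch, the same comparisons can be written as functions that return a result instead of printing and exiting, which makes the counting convention easy to try out. The names `hasExactArgs` and `hasMinArgs` are hypothetical and are not part of require.h.

```cpp
// Hypothetical helpers (not part of require.h): they perform the same
// comparisons as the argument-count checks, but return the result
// instead of printing a message and calling exit().

// True when exactly `args` arguments follow the program name.
inline bool hasExactArgs(int argc, int args) {
  return argc == args + 1;  // argv[0] holds the program name
}

// True when at least `minArgs` arguments follow the program name.
inline bool hasMinArgs(int argc, int minArgs) {
  return argc >= minArgs + 1;
}
```

For a command line such as `prog in.txt out.txt`, `argc` is 3, so a check for exactly two arguments succeeds and a check for exactly three fails.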

From Volume 1, Chapter 9: 

//: C0A:Stack4.h
// With inlines
#ifndef STACK4_H
#define STACK4_H
#include "../require.h"

class Stack {
  struct Link {
    void* data;
    Link* next;
    Link(void* dat, Link* nxt):
      data(dat), next(nxt) {}
  }* head;
public:
  Stack() { head = 0; }
  // The client must pop everything before destruction
  ~Stack() {
    require(head == 0, "Stack not empty");
  }
  void push(void* dat) {
    head = new Link(dat, head);
  }
  void* peek() { return head->data; }
  void* pop() {
    if(head == 0) return 0;
    void* result = head->data;
    Link* oldHead = head;
    head = head->next;
    delete oldHead;
    return result;
  }
};
#endif // STACK4_H ///:~
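
Because the Stack above stores `void*` and its destructor insists that the stack be empty, the client has to pop every element, cast it back to its real type, and delete it. The following is only a sketch of that usage pattern. To let it compile on its own it repeats a stand-in for the class (with `assert()` in place of `require()`); in the book's source tree you would include the header instead. The helper names `drainStrings` and `demo` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-in with the same interface as the Stack listing, repeated only
// so this sketch compiles by itself; assert() replaces require().
class Stack {
  struct Link {
    void* data;
    Link* next;
    Link(void* dat, Link* nxt) : data(dat), next(nxt) {}
  }* head;
public:
  Stack() : head(0) {}
  ~Stack() { assert(head == 0 && "Stack not empty"); }
  void push(void* dat) { head = new Link(dat, head); }
  void* pop() {
    if (head == 0) return 0;
    void* result = head->data;
    Link* oldHead = head;
    head = head->next;
    delete oldHead;
    return result;
  }
};

// Hypothetical helper: pops every element, casts it back to
// std::string, appends its text, and deletes the object that the
// caller allocated. It stops early if a null pointer was pushed.
inline std::string drainStrings(Stack& s) {
  std::string all;
  while (void* p = s.pop()) {
    std::string* str = static_cast<std::string*>(p);
    all += *str;
    delete str;
  }
  return all;
}

// Hypothetical demo: push three heap-allocated strings, then drain
// them. Elements come back in last-in, first-out order.
inline std::string demo() {
  Stack s;
  s.push(new std::string("a"));
  s.push(new std::string("b"));
  s.push(new std::string("c"));
  return drainStrings(s);  // the stack is empty before ~Stack runs
}
```

The design choice this illustrates is that the container owns only its links, never the objects they point to, so cleanup is always the caller's job.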



//: C0A:Dummy.cpp
// To give the makefile at least one target
// for this directory
int main() {} ///:~